Abstract

We introduce a search technique that is sensitive to a broad class of signals with
large final state multiplicities. Events are clustered into large radius jets and jet sub- 
structure techniques are used to count the number of subjets within each jet. The 
search consists of a cut on the total number of subjets in the event as well as the 
summed jet mass and missing energy. Two different techniques for counting subjets 
are described and expected sensitivities are presented for eight benchmark signals. 
These signals exhibit diverse phenomenology, including 2-step cascade decays, direct
three body decays, and multi-top final states. We find improved sensitivity to these 
signals as compared to previous high multiplicity searches as well as a reduced re- 
liance on missing energy requirements. One benefit of this approach is that it allows 
for natural data driven estimates of the QCD background.

1. INTRODUCTION

The search for physics beyond the Standard Model is a central focus of LHC research. The 
motivations for extensions of the Standard Model are multi-faceted, spanning such diverse 
physics topics as the identity of dark matter, the radiative stability of the weak scale, the 
unification of forces, and the origin of the baryon asymmetry of the Universe. Apart from 
these specific theory inputs, which suggest certain classes of models, there is the generic goal
of thoroughly probing the weak scale for new physics, whatever that might be. In order to
cover the vast range of possibilities for new physics that open up when the input from theory 
is loosened, it is imperative for the LHC to carry out an extensive experimental program 
that is sensitive to the widest possible range of new physics signatures. 

Since it is not possible to perform model independent searches for new physics — doing so 
is limited by both theoretical and experimental systematic uncertainties — it is necessary to 
design searches for new physics that are targeted at specific experimental signatures. Typi- 
cally such searches are based on exploiting a single key handle that significantly reduces the 
dominant Standard Model backgrounds, namely those originating from QCD jet production. 
For instance, it is common to require hard electroweak particles such as leptons or photons 
or (large amounts of) missing energy. Requiring b-tagged jets, massive jet resonances or 
extremely high energy events can provide alternative avenues for parametrically reducing 
the QCD background. These considerations exist both because of triggering requirements 
necessary for permanently storing events to tape and because realistic systematic uncertain- 
ties in any case limit the degree to which increased integrated luminosity leads to increased 
sensitivity. 

The past several years have seen increasing attention being paid to high multiplicities 
(i.e. requiring that events have more than ~ 6 final state jets) as a way of parametrically 
reducing QCD contributions to searches for new physics. Historically this approach was 
motivated by searches for black holes at the LHC (see e.g. ref. [1]), but more recently high
multiplicity searches have been advocated as an effective technique for helping to probe
scenarios with natural supersymmetry as well as those with baryonic R-parity violating
supersymmetry [2-7]. High multiplicity final states also arise in theories of strong dynamics
where new colored objects can produce four to eight top quarks through the production of
"coloron" vector resonances or colored technipions [8, 9]. Finally, some models introduced
to explain the magnitude of the $t\bar{t}$ asymmetry measured at the Tevatron also predict large
final state multiplicities [10].

One of the challenges of using high multiplicity as a handle for reducing QCD backgrounds 
is that the background rate is intrinsically difficult to calculate. The current state of the art 
for tree-level jet production is $2 \to 7$, with $2 \to 6$ being the most that is typically feasible.
Significantly, these tree-level calculations have unquantified uncertainties in their rates and
distributions. One of the computational challenges is that high multiplicity final states have
enormous configuration spaces; for instance the 10 jet final state has 28 dimensions, with
the consequence that it is infeasible to densely populate the configuration space with Monte
Carlo events. Many of the configuration space variables, such as the angular separations
between jets, $\Delta R_{ij}$, have not been studied in depth and historically have been unreliably
calculated with tree-level Monte Carlo. 

One way that high multiplicity backgrounds are estimated is by extrapolating from lower 
multiplicities. There are well-known approximate empirical scaling relations connecting the 
$N$ jet production rate to the $N + 1$ rate. There has been some progress in deriving these
from first principles (see e.g. refs. [11, 12]); nevertheless, this approach comes with large
uncertainties in the rates that will cause these searches to quickly become systematically
limited. Additionally, for large $N$ the $p_T$ spectrum for the $N$th jet becomes increasingly soft
and harder to measure accurately, leading to additional uncertainties. 

Recently alternative approaches to gaining sensitivity to high multiplicity final states 
have been developed. In effect these proposals factorize the problem: instead of directly 
counting jets, the event is clustered into a fixed number of large radius jets ($N$ = 4, 5, or
6) whose substructure is then further scrutinized. Because the jet radius is large, these $N$
"fat" jets will incorporate most of the radiation in the central region of the detector. The 
particles that would have formed multiple small radius jets in the traditional approach are 
clustered together into large radius jets. These fat jets will automatically have substructure, 
some of which will appear to originate from hard $1 \to 2$ parton shower splittings. While such
splittings certainly occur in QCD, they occur relatively rarely with the result that requiring 
multiple fat jets to have multiple hard splittings helps to separate signal events with high 
multiplicities from the dominant low multiplicity QCD background. 

The fat jet approach has an additional benefit in that it may be better suited to getting a 
good handle on systematic uncertainties in the QCD backgrounds. Large radius jets, partic- 
ularly well separated ones, have properties that are relatively independent from one another, 
since their dynamics are largely driven by the parton shower, which is local in nature. For 
instance, the mass of one fat jet is not strongly correlated with the masses of other fat 
jets in the event. By exploiting the approximate independence of jets, QCD backgrounds 
can be estimated with a data driven analysis. Effectively the large configuration space we 
started out with has been factorized into a much smaller fat jet configuration space tied to
(approximately identical) configuration spaces encoding the fat jets' substructure. This
factorization lends itself to the measurement of appropriate jet templates, which can then
be combined with fat jet distributions to arrive at background estimates. This approach
is of obvious practical importance to experimental searches, but it is also useful for theoretical
calculations of the backgrounds, since searching for observables that reduce QCD
backgrounds by six to ten orders of magnitude while acquiring enough statistics in Monte
Carlo can range from challenging to prohibitive.

Specifically, this study builds on a recent paper [13] that proposed $M_J$, the sum over fat
jet masses, as an effective observable for separating high multiplicity signals from Standard
Model backgrounds. Jet mass is the simplest jet substructure observable, but it is also 
one of the coarsest. This article advocates a more refined use of jet substructure to probe 
high multiplicity events, in particular by counting the number of subjets inside each fat jet. 
This may seem similar to clustering events into small radius jets and counting the resulting 
number of jets, but the potential to systematically estimate QCD backgrounds with a data 
driven approach distinguishes it from the traditional approach. 

This article is organized as follows. Sec. 2 presents the two subjet counting techniques
introduced in this paper. Sec. 3 describes how the backgrounds were generated in Monte
Carlo. Sec. 4 describes how the backgrounds were validated against ATLAS data and how
the QCD backgrounds should be amenable to data driven estimates. Sec. 5 introduces
eight benchmark signals and presents the expected sensitivity of our search strategy. A
comparison to previous high multiplicity searches is also made. We conclude in Sec. 6 with
some general discussion.

2. COUNTING SUBJETS 

This section describes the two subjet counting techniques implemented in this study,
$n_{k_T}$ and $n_{CA}$. The former is a straightforward application of the exclusive $k_T$ algorithm [14].
The latter counts subjets by recursively inspecting the structure of the Cambridge/Aachen
[15] clustering tree of the fat jet. Both are implemented using FastJet 3 [16]. As we will
see, searches incorporating these methods yield improved sensitivity to the high multiplicity
signals considered in Sec. 5. The two algorithms result in very similar expected limits, with
the difference that $n_{CA}$ does better than $n_{k_T}$ in cases where the QCD backgrounds are more
important.

2.1. Wide radius jets 

Wide radius jets are now a standard technique in high energy physics searches. The basic
structure of our high multiplicity search is built around such fat jets and is common to both
subjet counting techniques. First the event is clustered into fat jets with $R_0 = 1.2$ using the
anti-$k_T$ jet algorithm [17]. Next the fat jets are trimmed [18] with the parameters $r_{\rm cut} = 0.3$
and $f_{\rm cut} = 0.05$. While fat jets are particularly sensitive to pile-up, making some sort of jet
grooming necessary, it has been shown that jet substructure observables such as jet mass
and N-subjettiness [19] have a significantly reduced sensitivity to pile-up effects with this
choice of parameters [20]. Next the leading fat jet is required to have $p_T > 100$ GeV, while
subleading fat jets are required to have $p_T > 50$ GeV. Only those events with four or more
such fat jets are considered. Then the subjet count of each of the four leading fat jets is
calculated using one of the two algorithms described below. Finally cuts are made on the
observables

$$ M_J = \sum_{i=1}^{4} m_i, \qquad N_J = \sum_{i=1}^{4} n_i \tag{1} $$

and the missing transverse energy ($\not\!E_T$).
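
As an illustration of the selection just described, the following is a minimal sketch of how $M_J$ and $N_J$ could be assembled from already clustered and trimmed fat jets. The function name `event_observables` and the dictionary keys (`pt`, `mass`, `n_sub`) are hypothetical and chosen only for this example; the subjet counts are assumed to have been computed beforehand by one of the two counting algorithms.

```python
def event_observables(fat_jets, lead_pt=100.0, sublead_pt=50.0, n_fat=4):
    """Return (M_J, N_J) for one event, or None if the fat-jet selection fails.

    fat_jets: list of dicts with hypothetical keys
        'pt'    transverse momentum in GeV
        'mass'  trimmed jet mass in GeV
        'n_sub' subjet count of the jet
    """
    jets = sorted(fat_jets, key=lambda j: j["pt"], reverse=True)
    # Leading fat jet must have pT > 100 GeV.
    if not jets or jets[0]["pt"] <= lead_pt:
        return None
    # Subleading fat jets must have pT > 50 GeV.
    selected = [jets[0]] + [j for j in jets[1:] if j["pt"] > sublead_pt]
    # Only events with four or more such fat jets are considered.
    if len(selected) < n_fat:
        return None
    leading = selected[:n_fat]
    m_j = sum(j["mass"] for j in leading)   # Eq. (1), summed jet mass
    n_j = sum(j["n_sub"] for j in leading)  # Eq. (1), total subjet count
    return m_j, n_j
```

Only the four leading fat jets enter the sums, so a fifth jet in the event changes neither observable.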

2.2. Counting with $k_T$

The exclusive $k_T$ algorithm is defined via two metrics, $d_{ij}$ and $d_{iB}$, and a dimensionful
resolution parameter $d_{\rm cut}$ [21]. The jet-jet metric $d_{ij}$ and the jet-beam metric $d_{iB}$ are defined
as:

$$ d_{ij} = \min\left[p_{Ti}^2,\, p_{Tj}^2\right] \Delta R_{ij}^2, \qquad d_{iB} = p_{Ti}^2 \tag{2} $$

Here, $p_{Ti}$ is the transverse momentum of protojet $i$ and $\Delta R_{ij}^2 = \Delta\eta_{ij}^2 + \Delta\phi_{ij}^2$. The exclusive
mode of the algorithm proceeds by sequentially clustering pairs of protojets, stopping once
all the $d_{ij}$ and $d_{iB}$ are above $d_{\rm cut}$. When the smallest metric is $d_{ij}$, $i$ and $j$ are combined.
When the smallest metric is $d_{iB}$, protojet $i$ is set aside as a beam jet. The total number of
subjets (in the exclusive sense) is then given by the number of protojets that are not beam
jets and that remain once the clustering step has terminated.¹ For our particular application
we define $n_{k_T}$ by taking

$$ \sqrt{d_{\rm cut}} = f_{k_T}\, p_{TJ}, \tag{3} $$

where $p_{TJ}$ is the total transverse momentum of the fat jet and $f_{k_T}$ is a dimensionless
parameter that we take to be given by

$$ f_{k_T} = 0.04 $$

throughout. This value of $f_{k_T}$ leads to good separation between signal and background,
although for the range of signals considered the separation depends only weakly on the
particular value used. Then $n_{k_T}$ for a given fat jet is taken to be the number of subjets
identified by exclusive $k_T$ with

$$ p_T > p_{T\,\rm cut} = 40\ {\rm GeV}. $$

The dependence on the parameter $p_{T\,\rm cut}$ is rather weak for the massive fat jets of interest,
which contain softer subjets only infrequently.

¹ The beam jets, in the rare case they appear in the reclustering of our fat jets, are soft and at the periphery
of the fat jet, so the fact that they are discarded is just as we should like.

A significant advantage of this definition of $n_{k_T}$ is that a typical QCD jet will, due to its
asymmetric energy sharing (a hard core surrounded by soft radiation), have a small number
of subjets since much of the soft radiation will be clustered with the core (see Fig. 1). This
is in contrast to a naive application of Cambridge-Aachen for reclustering, which can yield
a large number of subjets even for a single-pronged QCD jet.
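
The counting procedure of this subsection can be sketched in a few lines of pure Python. This is an illustrative toy, not the FastJet-based implementation used in the study: constituents are plain four-vector tuples, recombination is assumed to be simple four-momentum addition, rapidity is used in place of $\eta$ in $\Delta R$, and the resolution scale is taken as $d_{\rm cut} = (f_{k_T}\, p_{TJ})^2$. The helper names (`four_vec`, `n_kt`) are hypothetical.

```python
import math

def four_vec(pt, y, phi):
    """Massless four-vector (px, py, pz, E) from pT, rapidity and azimuth."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y), pt * math.cosh(y))

def pt2(p):
    return p[0] ** 2 + p[1] ** 2

def rap(p):
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))

def dr2(a, b):
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (rap(a) - rap(b)) ** 2 + dphi ** 2

def n_kt(constituents, f_kt=0.04, pt_cut=40.0):
    """Count exclusive-kT subjets of one fat jet (toy O(n^3) version)."""
    jets = list(constituents)
    if not jets:
        return 0
    total = tuple(sum(c) for c in zip(*jets))
    d_cut = (f_kt * math.sqrt(pt2(total))) ** 2   # assumed form of Eq. (3)
    while jets:
        # Smallest beam distance d_iB = pT_i^2 ...
        best = min(((pt2(p), i, None) for i, p in enumerate(jets)), key=lambda t: t[0])
        # ... compared with every pair distance d_ij.
        for i in range(len(jets)):
            for j in range(i + 1, len(jets)):
                d = min(pt2(jets[i]), pt2(jets[j])) * dr2(jets[i], jets[j])
                if d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > d_cut:           # all metrics above d_cut: stop clustering
            break
        if j is None:
            jets.pop(i)         # beam jet, set aside and not counted
        else:
            jets[i] = tuple(a + b for a, b in zip(jets[i], jets[j]))
            jets.pop(j)
    return sum(1 for p in jets if math.sqrt(pt2(p)) > pt_cut)
```

In this toy, a soft particle close to a hard core is merged into the core, and an isolated soft particle at the edge of the jet is removed as a beam jet, so neither increases the count.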



2.3. Counting with Cambridge-Aachen

The $n_{k_T}$ algorithm introduced in Sec. 2.2 was entirely generic in its motivation and
approach. The present method, denoted by $n_{CA}$, aims instead to count the number of hard
partons consistent with the decay of a massive particle. The $n_{CA}$ algorithm is explicitly
constructed to:

• identify massive substructure; and 

• 'undercount' the number of subjets within a given fat jet if the energy sharing among 
the subjets is very asymmetric. 

The latter requirement is made because asymmetric sharing of energy between subjets is 
a telltale sign of subjets generated via the parton shower. The method is in the spirit of 
the various substructure algorithms that have emerged since the introduction of the BDRS 
procedure [22] and which make use of the information encoded in the clustering tree of the 
jet. In particular, it is closely related to an intermediate step in the HEPTopTagger [23, 24].

We determine the number of subjets by unclustering the fat jet down to the mass scale
$m_{\rm cut}$, throwing out subjets with an asymmetric energy sharing as defined by $y_{\rm cut}$. The
number of identified subjets that then pass an additional $p_T$ cut yields $n_{CA}$. In detail, the
method is defined as follows:

1. Cluster a given fat jet using the Cambridge/Aachen algorithm.

2. To define $n_{CA}$ inspect the clustering tree of the fat jet via the following recursive
procedure, starting with $j$ equal to the fat jet itself.

3. Uncluster $j$ into $j_1$ and $j_2$ with $p_{T1} > p_{T2}$.

4. If $m_j < m_{\rm cut}$ or $\Delta R(j_1, j_2) < R_{\rm min}$ consider $j$ a subjet and exit the recursion.

5. If $p_{T2} < y_{\rm cut} \cdot (p_{T1} + p_{T2})$ throw out $j_2$.

6. Continue the recursion on $j_1$ and (if it is retained) $j_2$.

7. When the recursion is complete, count the number of identified subjets with $p_T > p_{T\,\rm cut}$;
this number is $n_{CA}$.

FIG. 1: Example of a fat (and very massive) QCD jet with $p_T = 910$ GeV, $m = 360$ GeV
and with its two $n_{k_T}$ subjets (left) and its four $n_{CA}$ subjets (right) indicated. Note that
this jet has not been trimmed to better illustrate the different treatment of soft radiation
in $n_{CA}$ and $n_{k_T}$ (the dark gray cells on the right do not belong to any identified subjets).

So, for example, an idealized two-pronged jet initiated by the hadronic decay of an energetic
Z boson would yield (supposing that the decay angle is such as to yield a roughly symmetric
energy sharing) a count $n_{CA} = 1$ for $m_{\rm cut} > m_Z$ and $n_{CA} = 2$ for $m_{\rm cut} < m_Z$.
Throughout this study we use the following parameters:

$$ m_{\rm cut} = 30\ {\rm GeV}, \quad y_{\rm cut} = 0.10, \quad R_{\rm min} = 0.15, \quad p_{T\,\rm cut} = 30\ {\rm GeV}. \tag{4} $$

These values lead to good separation between signal and background, although for the range
of signals considered the separation provided by $n_{CA}$ depends only weakly on the particular
values used.
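
The recursive definition of $n_{CA}$ can be sketched as follows. This is a toy illustration operating on a hand-built binary clustering tree rather than on a FastJet Cambridge/Aachen cluster sequence; the `Node` class and the helpers `massless`, `merge`, `find_subjets` and `n_ca` are hypothetical names, and rapidity is assumed in place of $\eta$ when computing $\Delta R$. A node without children is treated as a subjet.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    """One node of a binary clustering tree with four-momentum (px, py, pz, e)."""
    px: float
    py: float
    pz: float
    e: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def pt(self) -> float:
        return math.hypot(self.px, self.py)

    @property
    def mass(self) -> float:
        m2 = self.e ** 2 - self.px ** 2 - self.py ** 2 - self.pz ** 2
        return math.sqrt(max(m2, 0.0))

    @property
    def rapidity(self) -> float:
        return 0.5 * math.log((self.e + self.pz) / (self.e - self.pz))

    @property
    def phi(self) -> float:
        return math.atan2(self.py, self.px)

def massless(pt: float, y: float, phi: float) -> Node:
    return Node(pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y), pt * math.cosh(y))

def merge(a: Node, b: Node) -> Node:
    return Node(a.px + b.px, a.py + b.py, a.pz + b.pz, a.e + b.e, a, b)

def delta_r(a: Node, b: Node) -> float:
    dphi = abs(a.phi - b.phi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a.rapidity - b.rapidity, dphi)

def find_subjets(j: Node, m_cut=30.0, y_cut=0.10, r_min=0.15) -> List[Node]:
    if j.left is None or j.right is None:
        return [j]                                   # nothing left to uncluster
    j1, j2 = sorted((j.left, j.right), key=lambda n: n.pt, reverse=True)  # step 3
    if j.mass < m_cut or delta_r(j1, j2) < r_min:    # step 4
        return [j]
    subjets = find_subjets(j1, m_cut, y_cut, r_min)  # step 6 on j1
    if j2.pt >= y_cut * (j1.pt + j2.pt):             # step 5: keep j2 only if symmetric enough
        subjets += find_subjets(j2, m_cut, y_cut, r_min)
    return subjets

def n_ca(fat_jet: Node, m_cut=30.0, y_cut=0.10, r_min=0.15, pt_cut=30.0) -> int:
    return sum(1 for s in find_subjets(fat_jet, m_cut, y_cut, r_min) if s.pt > pt_cut)  # step 7
```

For a symmetric two-prong jet with mass near 99 GeV, this toy returns 2 with the default $m_{\rm cut}$ and 1 once $m_{\rm cut}$ is raised above the jet mass, mirroring the Z boson example in the text.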

2.4. Comparison of $n_{k_T}$ and $n_{CA}$

This section has introduced two distinct subjet counting techniques, and it is interesting to 
ask how they are related. A detailed comparison between the two algorithms is complicated 
by the fact that each is defined by several parameters. For simplicity we restrict ourselves 
to the parameter choices made above. Qualitatively, the two algorithms have a number of 
similar features. 

On a jet-by-jet basis, there are strong correlations between $n_{k_T}$ and $n_{CA}$, with $n_{CA}$ typically
yielding more subjets than $n_{k_T}$. Fig. 1 illustrates a pronounced example of the tendency
for $n_{CA}$ to identify more subjets. For a given ensemble of jets, it is useful to define the
normalized distribution $P(n)$, which is the fraction of jets with $n$ subjets. The left panel of Fig. 2
shows $P(n)$ for both algorithms for a sample of leading QCD jets generated by MadGraph and
with $p_T > 100$ GeV. Also illustrated is $P(n)$ for a sample² of leading jets drawn from signal
events where pair produced gluinos decay via $\tilde{g} \to t\bar{t}\chi$ ($m_{\tilde{g}} = 600$ GeV and $m_\chi = 60$ GeV).



² The MadGraph and signal-like samples, which are used throughout this section, are described in more

FIG. 2: Left: The subjet distributions $\log_{10} P_{n_{k_T}}(n_{k_T})$ and $\log_{10} P_{n_{CA}}(n_{CA})$ in blue and red,
respectively. Right: The blue and red distributions show $\delta\mu_{n_{CA}}(n_{k_T})$ and $\delta\mu_{n_{k_T}}(n_{CA})$,
respectively, which are defined in Eq. 7. The error bars on $\delta\mu_{n_{CA}}(n_{k_T})$ and $\delta\mu_{n_{k_T}}(n_{CA})$
correspond to the standard deviations $\sigma_{n_{CA}}$ and $\sigma_{n_{k_T}}$. For both panels the solid lines
correspond to a sample of leading QCD jets generated by MadGraph and with
$p_T > 100$ GeV, while dashed lines correspond to a signal-like sample described in the text.



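To study how $n_{k_T}$ and $n_{CA}$ are correlated, it is useful to introduce the joint distribution
$P(n_{k_T}, n_{CA})$ with

$$ \sum_{n_{k_T},\, n_{CA}} P(n_{k_T}, n_{CA}) = 1. \tag{5} $$

From the joint distribution one can define the mean of $n_{CA}$ as a function of $n_{k_T}$ as well as the
mean of $n_{k_T}$ as a function of $n_{CA}$:

$$ \mu_{n_{CA}}(n_{k_T}) = \sum_{n_{CA}} n_{CA}\, P(n_{k_T}, n_{CA}) \quad \text{and} \quad \mu_{n_{k_T}}(n_{CA}) = \sum_{n_{k_T}} n_{k_T}\, P(n_{k_T}, n_{CA}) \tag{6} $$

From this one can then define the quantities

$$ \delta\mu_{n_{CA}}(n_{k_T}) = \mu_{n_{CA}} - n_{k_T} \quad \text{and} \quad \delta\mu_{n_{k_T}}(n_{CA}) = \mu_{n_{k_T}} - n_{CA} \tag{7} $$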

which are shown in the right panel of Fig. 2. It can be seen that $n_{k_T}$ and $n_{CA}$ track
one another closely for small $n$, but that for larger numbers of subjets $n_{CA}$ tends to
pull further and further ahead of $n_{k_T}$. The correlation between $n_{k_T}$ and $n_{CA}$ is somewhat
tighter in the signal-like sample than in the QCD sample.

For fixed $n_{k_T}$ the distribution $P(n_{k_T}, n_{CA})$ is a function of $n_{CA}$ with standard deviation
$\sigma_{n_{CA}}(n_{k_T})$. The standard deviation $\sigma_{n_{CA}}(n_{k_T})$ is a steadily rising function of $n_{k_T}$, as
illustrated by the error bars in the right panel of Fig. 2. For the QCD sample it grows from
$\sigma_{n_{CA}}(1) \approx 0.3$ to $\sigma_{n_{CA}}(6) \approx 1.0$, indicating that for small $n$ the two algorithms identify
very similar numbers of subjets, but that as the amount of substructure grows there is less


detail in Sec. 3.2.1 and 5.1, respectively. The MadGraph samples are used because of the superior statistics
available.

FIG. 3: Scaling patterns for the subjet distributions of leading QCD fat jets generated by
MadGraph. Left: The ratio $w(n)$ for $n_{k_T}$ (bottom) and $n_{CA}$ (top) for jets with
$p_T > 100$ GeV (blue) and $p_T > 500$ GeV (red). "Staircase scaling" corresponds to a flat
$w(n)$. Right: The ratio $r(n)$ for $n_{k_T}$ (bottom) and $n_{CA}$ (top) for jets with $p_T > 100$ GeV
(blue) and $p_T > 500$ GeV (red). "Double staircase scaling" corresponds to a flat $r(n)$ and
appears to be emerging in the high $p_T$ regime.



agreement between the two algorithms. Similarly the standard deviation $\sigma_{n_{k_T}}(n_{CA})$ grows
approximately linearly from $\sigma_{n_{k_T}}(1) \approx 0.4$ to $\sigma_{n_{k_T}}(8) \approx 0.9$. Note that this dispersion is
logically distinct from the divergence in the mean seen in $\delta\mu_{n_{CA}}(n_{k_T})$.
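
As a concrete illustration of how the shift in the mean and the dispersion can be extracted from a sample of jets, the sketch below takes per-jet pairs $(n_{k_T}, n_{CA})$ and returns $\delta\mu_{n_{CA}}(n_{k_T})$ and $\sigma_{n_{CA}}(n_{k_T})$ for each value of $n_{k_T}$. It assumes that the mean and standard deviation are taken over the distribution of $n_{CA}$ normalized at fixed $n_{k_T}$ (a conditional distribution); the function name `conditional_stats` is hypothetical.

```python
import math
from collections import defaultdict

def conditional_stats(pairs):
    """Map each n_kT value to (delta_mu, sigma) of n_CA at that fixed n_kT.

    pairs: iterable of (n_kt, n_ca) tuples, one per jet.
    """
    groups = defaultdict(list)
    for n_kt, n_ca in pairs:
        groups[n_kt].append(n_ca)
    stats = {}
    for n_kt, values in sorted(groups.items()):
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        # delta_mu is the mean of n_CA minus n_kT, as in Eq. (7).
        stats[n_kt] = (mean - n_kt, math.sqrt(var))
    return stats
```

Swapping the two entries of each pair gives the corresponding quantities for $n_{k_T}$ at fixed $n_{CA}$.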

Note that both $n_{CA}$ and $n_{k_T}$ are peaked at 3 for the leading jet of the signal (see Fig. 2).
This is as expected, since this signal contains up to 12 final state quarks, with the result
that, if the leading four fat jets capture all of the decay products, an average of 3 subjets
per fat jet is expected.

Interestingly, the subjet distributions for QCD fat jets are not governed by approximate
"staircase scaling," as one might have expected. This is a scaling pattern defined by the
condition that, if we define the ratio

$$ w(n) \equiv \frac{P(n+1)}{P(n)}, \tag{8} $$

then $w(n)$ is a constant independent of $n$. Instead, a significantly steeper distribution is
seen. The variable

$$ r(n) \equiv \frac{w(n+1)}{w(n)} = \frac{P(n)\,P(n+2)}{P(n+1)^2} \tag{9} $$

is useful for characterizing deviations from constant $w(n)$, with staircase scaling corresponding
to $r = 1$. For the case of QCD fat jets the ratios $w(n)$ and $r(n)$ for both $n_{k_T}$ and $n_{CA}$ are
illustrated in Fig. 3. Since there are intrinsic energy scales in both $n_{k_T}$ and $n_{CA}$, low $p_T$ jets
are sensitive to these scales. For $p_T > 100$ GeV, the $w(n)$ distribution drops rapidly before
asymptoting to a constant of $w_{n_{CA}} \approx 0.05$ and $w_{n_{k_T}} \approx 0.03$, indicating staircase scaling with
a very hard spectrum. Requiring higher $p_T$ jets makes the influence of the intrinsic scales in
the counting algorithms less relevant. If the minimum $p_T$ of the jet is raised to 500 GeV, the
scales inside the subjet counting algorithm will play a much smaller role. Even for these high
$p_T$ jets staircase scaling is not observed; instead $w(n)$ appears to have a staircase behavior
and $r(n)$ is approximately constant with $r_{n_{CA}} \approx 0.5$ and $r_{n_{k_T}} \approx 0.4$. This can be called
"double staircase scaling" and indicates a distribution of subjets that scales like

$$ P(n) \sim r^{n^2/2}. $$

This is in contrast to the more traditional staircase scaling that gives

$$ P(n) \sim w^{n}, $$

demonstrating that subjet counting is not related in a straightforward manner to jet counting.
Deviations from staircase scaling are well known and have even been explained in [12];
however, the deviations from staircase scaling noted here do not appear to be "Poisson,"
which would predict that $r$ asymptotes to unity; this does not appear to happen.
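
The three scaling patterns mentioned here can be checked numerically with a short sketch. The helper names `w_ratio` and `r_ratio` are hypothetical; they implement Eqs. (8) and (9) on a list of values $P(n)$ for consecutive $n$, and the overall normalization of $P(n)$ drops out of both ratios.

```python
import math

def w_ratio(p):
    """w(n) = P(n+1) / P(n) for a list of consecutive P(n) values."""
    return [p[n + 1] / p[n] for n in range(len(p) - 1)]

def r_ratio(p):
    """r(n) = w(n+1) / w(n) = P(n) P(n+2) / P(n+1)^2."""
    w = w_ratio(p)
    return [w[n + 1] / w[n] for n in range(len(w) - 1)]

# Staircase scaling: P(n) ~ w^n gives constant w(n) and r(n) = 1.
staircase = [0.3 ** n for n in range(1, 8)]

# Double staircase scaling: P(n) ~ r^(n^2/2) gives constant r(n) = r.
double_staircase = [0.5 ** (n * n / 2.0) for n in range(1, 8)]

# Poisson: P(n) = lam^n exp(-lam) / n!, for which r(n) = (n+1)/(n+2) tends to 1.
lam = 2.0
poisson = [lam ** n * math.exp(-lam) / math.factorial(n) for n in range(0, 8)]
```

Evaluating `r_ratio` on the three lists gives a constant 1, a constant 0.5, and a sequence rising towards 1, respectively, which makes the distinction drawn in the text explicit.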



3. MONTE CARLO CALCULATIONS 

This article studies the use of the total number of subjets in an event as a way to
separate new physics scenarios that produce many final state quarks and gluons from QCD
and electroweak-scale backgrounds. The ultimate goal is to reduce the reliance upon missing
transverse energy ($\not\!E_T$) and lepton requirements so as not to veto on signals that have neither,
while still obtaining relatively low background search regions. Since $\not\!E_T$ and leptons are the
standard handles used to reduce QCD backgrounds, it is particularly important to have
reliable estimates of multijet production rates and differential distributions. Of course, the
same holds true for the non-QCD backgrounds. Consequently this section and the next play
an important role in everything that follows. Throughout, the calculations are performed
at a center of mass energy of $\sqrt{s} = 8$ TeV.



The rest of this section is organized as follows. Sec. 3.1 describes the calculation of
non-QCD backgrounds such as $V$+jets and $t\bar{t}$+jets. Sec. 3.2 describes the two approaches used to
generate QCD Monte Carlo events. Sec. 3.3 describes how detector effects and jet clustering
are implemented. These latter two subsections are complemented by the discussion in Sec. 4,
which focuses on data driven methods.



3.1. Non-QCD Backgrounds 

The dominant Standard Model backgrounds are QCD, $V$+jets and $t\bar{t}$+jets. Since, however,
any backgrounds where there is an intrinsic mass scale are potentially important, the
leading subdominant backgrounds were also computed for completeness; see Table 1. The
non-QCD backgrounds used for our analyses were generated using MadGraph 4.5.1 [25-27]
and showered and hadronized using PYTHIA 6.4 [28]. The five-flavor MLM matching
scheme with a shower-$k_\perp$ scheme was used to account for the extra radiation [29].

It is the high $p_T$ tails of these backgrounds that will make the dominant contribution to
signal regions. Since some of these backgrounds have quite large cross sections, it is not

TABLE 1: The non-QCD backgrounds used in this analysis. The subscript $i$ in $n_i$
indicates the highest jet multiplicity considered in the matched sample. Thus $t\bar{t} + n_2$ is
$t\bar{t} + 0j$, $t\bar{t} + 1j$, $t\bar{t} + 2^{+}j$, where the last jet multiplicity is an inclusive process that can
include higher jet multiplicities generated through the parton shower. The $p_T$ slicing is
with respect to the leading massive object. The two samples marked with a * denote that
resonant top production is excluded to avoid double counting. The last column indicates
the ratio between the expected number of events at 30 fb$^{-1}$ and the number of Monte
Carlo events generated.

feasible to generate 30 fb$^{-1}$ worth of Monte Carlo. Instead, the backgrounds are generated
in different $p_T$ bins of the heavy particle, $p_{T,\rm heavy}$, where a heavy particle is any one of $t$,
$W^\pm$, $Z^0$, or $H^0$.³ This ensures that the regions of phase space that result in the largest
contamination of the signal region are computed with sizeable significance. Another important
consideration is that events with small $p_{T,\rm heavy}$ can still have large $H_T$ as a consequence of
high jet activity. Since these backgrounds are potentially important for regions of phase
space with sizeable $\not\!E_T$, it is important for them to be computed with sufficient significance.
However, if $p_{T,\rm heavy}$ is small, then so is the intrinsic $\not\!E_T$, with the consequence that the
high $H_T$ / low $p_{T,\rm heavy}$ region of phase space is not important for the signal region because
it is subdominant to the QCD contribution. In practice, the large event weights of the low
$p_{T,\rm heavy}$ regions do not limit the statistical accuracy of the background estimate.

The resulting backgrounds are shown in Table 1, where the subscript $i$ in $n_i$ indicates
the highest jet multiplicity considered in the matched sample. The different $p_{T,\rm heavy}$ bins
are listed in the third column. The five flavor matching scheme introduces one additional
complication for diboson and single top production. This arises because there can be
resonant top production in these two channels that has already been included in ordinary $t\bar{t}$
production. In order to exclude this double counting, a requirement of $|m_t - m_{bW}| > 15\,\Gamma_t$
is imposed.

³ When there are two or more heavy particles in the event, e.g. $t\bar{t}$, $p_{T,\rm heavy}$ denotes the larger $p_T$.

The two most important non-QCD backgrounds for our search are $V$+jets and $t\bar{t}$+jets.
These backgrounds not only have large cross sections but are also jet rich and result in a
reasonable amount of $\not\!E_T$. Backgrounds like $t\bar{t}t\bar{t}$, $t\bar{t}V$ and $t\bar{t}H$, which have jet multiplicities
and $\not\!E_T$ comparable to that of some of the benchmark signals we study, have small cross
sections and make a negligible contribution to the total background (although we include
them for completeness).



3.2. QCD 

Several techniques are used to calculate the QCD contribution to signal regions. Sec. 3.2.1
describes a calculation of the QCD background using an MLM-matching scheme
implemented in MadGraph and PYTHIA using unweighted events with up to four partons matched.
This is a relatively low multiplicity method of generating backgrounds and relies heavily on
the parton shower to generate high multiplicities; nevertheless, it is a standard calculational
method that makes it easy to get good Monte Carlo statistics over the entire signal
region. Sec. 3.2.2 describes a calculation of the QCD background using a CKKW-matching
scheme implemented in SHERPA using weighted events with up to six partons matched. This
method allows significantly higher multiplicities to be generated and samples the high
energy and high multiplicity tails with weighted events. The use of weighted events tends to
hurt convergence and gives relatively poor Monte Carlo statistics.



3.2.1. MadGraph+PYTHIA

One set of QCD backgrounds was generated with MadGraph 4.5.1 [25-27] at $\mathcal{O}(\alpha_s^4)$
and showered with PYTHIA 6.4 [28] using MLM matching [29]. The events were generated
in multiple exclusive samples varying both the matrix-element parton multiplicity from
$n_j = 2$ to $n_j = 4$ and the scalar transverse energy $H_T$ using the 5-flavor matching scheme
[30]. It is not practical to generate multiplicities higher than $n_j = 4$ in MadGraph due to
computational limitations. Since the event selection for our search strategy requires $n_j \geq 4$,
all jet substructure will be modeled by the parton shower. This is clearly insufficient for
gaining much confidence in the resulting background estimates. Nevertheless, the MadGraph
events will serve as a useful crosscheck in what follows. In addition, the high statistics
available from MadGraph will be very useful in creating missing energy templates in Sec. 4.1.



3.2.2. SHERPA 

The second event generator used to generate QCD backgrounds is SHERPA 1.4.0 [31-35].
SHERPA uses CKKW matching [36] and is capable of generating up to $n_j = 6$ at the matrix
element level. SHERPA can therefore generate more hard substructure using matrix elements
without relying on the parton shower. Consequently we will primarily be relying on SHERPA
for our QCD background estimates. This sample was generated and fully described in [4].⁴

One of the drawbacks of SHERPA is that it is not straightforward to generate separate
samples binned by $H_T$. Instead weighted Monte Carlo events can be generated. The main
problem with relying on weighted Monte Carlo is that it becomes more difficult to obtain
convergence in the signal region. One frequently observes that a single high weight event
(or a handful of them) can make large contributions to the tails of distributions, with the
consequence that statistical uncertainties in the tails become large.
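
The effect of a few high weight events on the statistical precision can be made concrete with a short sketch. It uses the standard estimate $\sqrt{\sum_i w_i^2}$ for the uncertainty on a sum of weights and the effective number of events $N_{\rm eff} = (\sum_i w_i)^2 / \sum_i w_i^2$; neither quantity is quoted in the text, and the function name `weighted_yield` is hypothetical.

```python
import math

def weighted_yield(weights):
    """Return (yield, statistical error, effective number of events) for one bin."""
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    if sum_w2 == 0.0:
        return 0.0, 0.0, 0.0
    return sum_w, math.sqrt(sum_w2), sum_w * sum_w / sum_w2

# 100 unit-weight events: 10% relative uncertainty, N_eff = 100.
flat = weighted_yield([1.0] * 100)

# The same bin plus a single event of weight 100: the yield doubles,
# but the relative uncertainty grows to about 50% and N_eff drops to about 4.
spiky = weighted_yield([1.0] * 100 + [100.0])
```

In this example one high weight event carries as much yield as all other events combined, so the bin behaves statistically as if it contained only about four events.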

Two separate weighting methods were used in generating the QCD backgrounds. The
first is the default weighting procedure in SHERPA. The second skews the weights towards
higher multiplicities for $2 \leq n_j \leq 6$, such that 256 times as many $n_j = 6$ events are
generated as compared to $n_j = 2$ events. This allowed for the generation of relatively more
high multiplicity events so as to better fill out the tails of the distributions. Together these
two weighting methods resulted in a total of 4.8 x 10 events passing our basic fat jet
requirements (see Sec. 2.1).



3.2.3. Comparison of MadGraph and SHERPA

In Figs. 4 and 5 the SHERPA results are compared to MadGraph+PYTHIA. We see that
there is generally good agreement between the two. Even the subjet count $N_{CA}$ shows good
agreement up to $N_{CA} = 8$, a regime in which both generators (especially MadGraph) are
relying on the parton shower to generate substructure. The biggest differences appear in
the tails of the distributions. As discussed above, the presence of high weight events in
SHERPA can lead to poor convergence. Whenever there is a large disagreement between the
two generators in these figures, it seems to be largely driven by this effect (cf. the large
statistical uncertainties in the regions of largest disagreement).

The disagreement in the tails of the distributions between SHERPA and MadGraph+PYTHIA
deserves further comment. While the naive expectation is that SHERPA should provide
a better description in the high $H_T$, $M_J$ and $N_J$ regime, there are significant statistical
uncertainties in the SHERPA sample arising from slow convergence of the weighted Monte
Carlo calculations. The features of the distributions also appear to be slightly pathological
in their behavior, appearing as inflection points in regions that one would expect to be
relatively smooth. In the case of $N_{CA}$, some of the bins are not even monotonically decreasing.
This article will rely on the SHERPA calculation for its background estimates; fortunately
the design of the search regions in Sec. 5 will not be heavily influenced by these features.
In Sec. 4 a data driven approach to calculating these backgrounds is presented, and the
disagreement between the data driven approach and the straight Monte Carlo calculation
will be, at least in part, related to these same features.



^ We thank Tim Cohen, Mariangela Lisanti, Eder Izaguirre, and Tim Lou for letting us use this Monte
Carlo sample.

FIG. 4: Comparison between SHERPA (red) and MadGraph+PYTHIA (blue) for three relevant
kinematic variables of the leading jet.




FIG. 5: Comparison between SHERPA (red) and MadGraph+PYTHIA (blue) for Ht (upper
left), Mj (upper right), N_CA (lower left) and N_kT (lower right). For the definition of the
latter two observables see Sec. 2. For the bottom two distributions, the MadGraph+PYTHIA
points have been shifted by half a unit to the right in order to facilitate the comparison.

3.3. Detector mockup and jet clustering 

After showering, all hadron-level events are passed to the PGS 4 [37] detector simulation,
which parameterizes the detector response. The detector parameters used are those of
the default ATLAS PGS card. The PGS output is clustered into 0.1 x 0.1 cells in η-φ
space, and then each cell is represented as a massless four-vector pseudoparticle. Finally,
those pseudoparticles with rapidities |y| < 2.5 are fed into FastJet 3, which we use for jet
clustering [16].

The primary purpose of PGS is to give estimates of the missing energy that arises from
imperfect detectors. As we will see in Sec. 4.1, for the QCD backgrounds this is best
accomplished by parametrizing the PGS QCD missing energy spectrum in terms of template
functions. In Sec. 4.1.1, our Monte Carlo background calculations are compared against
published ATLAS data. There we will see that the QCD missing energy templates will
need to be rescaled to obtain a better fit to the data. PGS is also useful for simulating
lepton identification efficiencies. However, for the primary proposal of this paper, no lepton
requirements or vetoes are made, and so lepton identification efficiencies will not play a large
role.
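
The cell clustering step described in this section can be illustrated with a minimal Python
sketch. It is not the PGS implementation: it assumes that particles are supplied as arrays of
(η, φ, E), it places each pseudoparticle at the center of its cell, and the function name
cells_to_pseudoparticles is hypothetical. The cell size of 0.1 and the requirement |y| < 2.5 are
taken from the text (for a massless pseudoparticle the rapidity y equals η).

```python
import numpy as np

def cells_to_pseudoparticles(eta, phi, energy, cell=0.1, y_max=2.5):
    """Sum energies in cell x cell towers; return massless (E, px, py, pz) rows.

    Illustrative sketch only: tower positions are taken at the cell centers.
    """
    eta = np.asarray(eta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    energy = np.asarray(energy, dtype=float)
    phi = np.mod(phi + np.pi, 2.0 * np.pi) - np.pi  # wrap phi into [-pi, pi)

    # Integer cell indices in eta and phi.
    ieta = np.floor(eta / cell).astype(int)
    iphi = np.floor(phi / cell).astype(int)

    # Accumulate the energy deposited in each cell.
    towers = {}
    for i, j, e in zip(ieta, iphi, energy):
        towers[(i, j)] = towers.get((i, j), 0.0) + e

    vectors = []
    for (i, j), e in sorted(towers.items()):
        eta_c = (i + 0.5) * cell
        phi_c = (j + 0.5) * cell
        if abs(eta_c) >= y_max:  # massless, so rapidity equals pseudorapidity
            continue
        pt = e / np.cosh(eta_c)
        vectors.append((e, pt * np.cos(phi_c), pt * np.sin(phi_c), pt * np.sinh(eta_c)))
    return np.array(vectors, dtype=float).reshape(-1, 4)
```

The resulting rows would then be handed to the jet clustering step; that interface is not
sketched here.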



3.4. Treatment of leptons 



The goal of this paper is to develop a search strategy that can dramatically reduce 
Standard Model backgrounds while making the least number of assumptions about the 
characteristics of the final state apart from its having a high multiplicity. Consequently, 
while it may be advantageous to require or veto on something like b-tagged jets or isolated 
leptons for a particular signal model, we do not do so here. For the broad class of signals 
we are interested in probing, the high final state multiplicity may be exclusively hadronic in 
origin or it may include some number of leptons. For models with multiple possible cascade 
topologies (or indeed any with top quarks or electroweak gauge bosons in the final state) 
both the hadronic and semi-hadronic modes may be simultaneously present, and signal 
discovery may require sensitivity to both channels. Consequently, throughout this study
we treat leptonic energy as hadronic energy: in both fat jet clustering
and subjet counting, hadronic and leptonic energy are treated democratically. This helps to
ensure that signal efficiencies are not unnecessarily degraded. It is interesting to ask whether
alternative treatments of the leptons might lead to effective search strategies without having 
to sacrifice the relative inclusiveness of the present search. Doing so, however, lies outside 
the scope of this paper. 



4. DATA DRIVEN BACKGROUNDS



High multiplicity searches push into regions of phase space that are challenging to model,
particularly in the case of the pure QCD backgrounds. For this reason it is important to
have as many handles on the backgrounds as possible. In particular a data driven
extrapolation of the background from a control region to the signal region would be especially
valuable for corroborating background estimates available from Monte Carlo. In this section
we explore how such a data driven estimate might be made. In Sec. 4.1 we specialize to the
particular case of missing energy. The resulting missing energy templates allow us to achieve
significantly improved statistics in our Monte Carlo estimates of QCD missing energy
acceptances. In Sec. 4.1.1 we compare our Monte Carlo background estimates to published
ATLAS data. Finally in Sec. 4.2 we discuss the possibility of extending the template method
to take into account Mj and Nj. These latter results are preliminary and are not used for
the background estimates that enter in the expected limits in Sec. 5.

FIG. 6: The missing energy significance, y = E_T^miss/√Ht (GeV^1/2), in three different Ht bins:
[300, 600] GeV (blue), [900, 1200] GeV (green), and [1500, 1800] GeV (red).



4.1. QCD missing energy templates 

One of the purposes of this work is to reduce the dependence of new physics searches on
missing energy requirements, which are particularly effective in reducing QCD backgrounds.
Thus it is important that we model E_T^miss, and in particular QCD E_T^miss, as accurately as possible.

QCD missing energy typically arises from two distinct sources at the LHC. The first is 
from neutrinos lost in semi-leptonic decays of bottom and charm quarks. This irreducible 
form of missing energy gives a long non-Gaussian tail to missing energy distributions but can 
be estimated through Monte Carlo calculations. The second form of missing energy arises 
from detector effects that result in particles being lost or otherwise mismeasured. This form 
of missing energy is usually parameterized as a response function on the jet-by-jet level and 
is typically Gaussian for several standard deviations. The typical amount of missing energy 
scales as the square root of the jet energy, although there is a small linear term that takes 
over at large jet energies. For QCD events it is this latter form of missing energy, which 
arises from detector effects, that is dominant. 

This article uses the approach of creating a probability distribution function for the
missing energy of QCD events as a function of Ht (cf. Sec. 4.2). This allows for a huge
reduction in the number of Monte Carlo events necessary for accurate background estimates
in the presence of missing energy requirements. Because the detector response is orthogonal
to other jet properties that will be used (and is in any case being parametrized by PGS) this
approach should faithfully reproduce the results of a much larger Monte Carlo calculation.
Specifically, a probability distribution function for missing energy significance,

    y = E_T^{miss} / \sqrt{H_T},

as a function of Ht is constructed from the unweighted MadGraph QCD sample:

    \mathcal{P}_{MET}(y; H_T)   with   \int_0^\infty dy \, \mathcal{P}_{MET}(y; H_T) = 1        (11)

3.23  2.4 ± 0.9  |  0.01  0.03 ± 0.03  |  0.05  -
5  1700  0.15  |  0.42  -  |  0.06  0.14 ± 0.07  |  0.84  0.44 ± 0.28  |  0.18  0.07 ± 0.09
6  1300  0.25  |  0.18  0.10 ± 0.09  |  0.08  0.16 ± 0.19  |  0.03  -  |  0.54  0.34 ± 0.24  |  0.00  -
6  1000  0.30  |  0.18  0.34 ± 0.17  |  0.13  0.14 ± 0.22  |  0.05  -  |  0.64  0.43 ± 0.16  |  0.00  -

TABLE 2: Comparison between backgrounds and ATLAS data driven estimates [38]. The
QCD backgrounds are generated with SHERPA as described in Sec. 3.2.2 whereas the
others are generated with MadGraph + Pythia as described in Sec. 3.1.

Thus for the rest of this study (where SHERPA will be used for all our QCD background
estimates), although the Ht distribution will be modeled by SHERPA, the mapping Ht → E_T^miss
will be modeled by a combination of PGS and MadGraph. We use MadGraph for this purpose
because doing so yields much better statistics (the ability to target specific Ht bins is one
advantage). This PDF can be used event-by-event to determine the probability that an
event, e_i, passes a specific E_T^miss requirement:

    \int_{E_{T,min}^{miss}/\sqrt{H_{T,e_i}}}^{\infty} dy \, \mathcal{P}_{MET}(y; H_{T,e_i})        (12)
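
The event-by-event use of the missing energy template can be sketched as follows. This is a
minimal illustration under stated assumptions rather than the procedure actually used for the
results in this article: the template is represented as a binned histogram of
y = E_T^miss/√Ht in bins of Ht, the integral is evaluated bin by bin, and the function names
build_template and pass_probability are hypothetical.

```python
import numpy as np

def build_template(y, ht, ht_edges, y_edges):
    """Binned template: one row per HT bin, normalized to unit area in y.

    Entries falling outside the supplied bin edges are dropped by the histogram.
    """
    counts, _, _ = np.histogram2d(ht, y, bins=[ht_edges, y_edges])
    widths = np.diff(y_edges)
    norm = counts.sum(axis=1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        pdf = np.where(norm > 0, counts / (norm * widths), 0.0)
    return pdf

def pass_probability(pdf, ht_edges, y_edges, ht_event, met_min):
    """Probability that an event with HT = ht_event has missing energy above met_min.

    Integrates the template row for the event's HT bin from
    y_min = met_min / sqrt(ht_event) upwards.
    """
    k = np.searchsorted(ht_edges, ht_event, side="right") - 1
    k = min(max(k, 0), len(ht_edges) - 2)  # clamp to a valid HT bin
    y_min = met_min / np.sqrt(ht_event)
    lo, hi = y_edges[:-1], y_edges[1:]
    overlap = np.clip(hi - np.maximum(lo, y_min), 0.0, None)  # part of each bin above y_min
    return float(np.sum(pdf[k] * overlap))
```

Weighting each background event by this probability, instead of accepting or rejecting it
outright, is what allows the tail of the missing energy distribution to be populated with far
fewer generated events.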



Fig. 6 shows the missing energy significance distributions for various Ht windows. Note
that these PDFs are just another way of parametrizing PGS's detector response. We will see
below, in Sec. 4.1.1, that it will be necessary to scale these PDFs to ensure a better fit to
published ATLAS data.

Looking forward to Sec. 4.2, the factorization we have introduced is given succinctly by

    d\sigma(E_T^{miss}, H_T) = \mathcal{P}_{MET}(E_T^{miss}/\sqrt{H_T}; H_T) \, d\sigma(H_T)        (13)

which can be compared to the analogous expression in Eq. 18 below.



4.1.1. Background validation

We validated our backgrounds against the signal regions used by an ATLAS search for
squarks and gluinos [38]. This led us to rescale our E_T^miss PDFs (see Sec. 4.1 and below), since
the ATLAS PGS parameters were resulting in too large QCD E_T^miss acceptances. The results
after the rescaling are shown in Table 2. Overall, our Monte Carlo background estimates
agree with the data driven estimates given in the ATLAS study. In those cases where there
is disagreement, our Monte Carlo always overestimates the ATLAS estimates so that the
expected sensitivities presented in Sec. 5 should be reasonably conservative.

Each of the ATLAS signal regions includes a lepton veto. Because ATLAS lepton
identification is only approximately reproduced by PGS, we expect larger differences between
Monte Carlo estimates and ATLAS estimates for those backgrounds with leptons. This is
indeed what we find in Table 2, where backgrounds with no leptons (like Z^0/γ*+jets) match
the data well. For events with E_T^miss arising from the decay of W^± bosons, such as W^±+jets
and ttbar+jets, the comparisons are systematically off because PGS is not identifying isolated
leptons with sufficiently high efficiency. This is not a major issue in the validation because
the searches described in this article neither require nor veto on leptons.

As mentioned above, the PGS treatment of angular and energy smearing does not faithfully
reproduce the response of the ATLAS detector. In particular, in the presence of E_T^miss cuts
the ATLAS PGS parameters result in an overestimate of QCD backgrounds by an order
of magnitude solely due to the treatment of E_T^miss. In order to better reproduce the QCD
backgrounds, the templates are rescaled by making the replacement

    \mathcal{P}_{MET}( E_T^{miss}/\sqrt{H_T}; H_T ) \to \mathcal{P}_{MET}( E_T^{miss}/(a\sqrt{H_T}); H_T )        (14)

The best fit to ATLAS estimates for QCD backgrounds in ^^^-rich regions is with a = 0.8. 
This rescaled template leads to significantly improved agreement between the Monte 
Carlo and ATLAS estimates of the QCD background. 



4.2. Fat jet templates 

The main obstacle to modeling 4-jet QCD production is the large dimensionality of the
space of observables under consideration. The quantity we would like to understand is the
9-dimensional 4-(fat)jet differential cross section dσ_4j(E_T^miss, m_i, n_i). Here E_T^miss is the missing
energy of the 4-jet event, the m_i are the masses of the four jets, and the n_i are the four
subjet counts (using e.g. the algorithm defined in Sec. 2). With dσ_4j in hand the chosen
cuts on

    M_J = \sum_i m_i   and   N_J = \sum_i n_i        (15)

can be imposed, thus yielding the expected QCD background in the signal region.

To make progress it is useful to reduce the dimensionality of the problem. This can
be done by making the assumption that each of the four jets is governed by a universal
probability distribution

    p_J(x, n; p_T)        (16)

with

    x = m / p_T

which describes the probability of a fat jet having n subjets and a particular value of m/pT,
as a function of the fat jet pT. The mass of a fat jet is correlated with the number of subjets
it contains, since it is impossible to get multiple subjets without having a sizable mass;
consequently p_J does not factorize and it is necessary to construct a two dimensional PDF.

The assumption of universality is not completely valid, in part because quark-initiated
jets and gluon-initiated jets will have different distributions. The hope is that
ensembles of jets will have similar ratios of quark- versus gluon-initiated jets and that the
distribution functions will not be radically different. In practice, this is the case since it
is challenging to distinguish quark-initiated jets from gluon-initiated jets and since it is
difficult to construct selection criteria that isolate one from the other. The assumption
of universality is even more aggressive, however, since it implies that these distribution
functions are independent of their environment. This assumption is known to be violated
to some degree, particularly as jets come closer together, but the pull of the environment
on properties of fat jets tends to be less than O(10%) in magnitude.^ In this section we will
be satisfied to check these assumptions empirically with Monte Carlo calculations, leaving
a more detailed study to future work. Note that we will not be applying the resulting
background estimates when calculating the estimated sensitivity of our search strategy:
the results presented in Sec. 5 will use the SHERPA Monte Carlo calculations described in
Sec. 3.2.2. The results in this section are a first attempt at studying more aggressive uses
of data driven approaches to QCD backgrounds.

The assumption underlying the form of this jet template is that a jet's substructure
(e.g. its mass) is determined by its pT and is independent of other jets in the event. The full
4-jet distribution is then obtained via the product:

    d\sigma_{4j}(E_T^{miss}, m_i, n_i; p_{Ti}) = d\sigma_{4j}(E_T^{miss}, p_{Ti}) \prod_{i=1}^{4} p_J(x_i, n_i; p_{Ti})        (17)

Here the pTi are the transverse momenta of the four jets. Thus the 9-dimensional distribution
dσ_4j(E_T^miss, m_i, n_i) has been re-expressed as a function of the 5-dimensional distribution
dσ_4j(E_T^miss, pTi) and the 3-dimensional jet template p_J(x, n; pT). A further reduction can be
made by assuming that E_T^miss only depends on the quantity Ht = Σ_i pTi. With this assumption
we can introduce the E_T^miss template P_MET(y; Ht), with

    y = E_T^{miss} / \sqrt{H_T}

thus ending up with the factorization

    d\sigma_{4j}(E_T^{miss}, m_i, n_i; H_T, p_{Ti}) = d\sigma_{4j}(p_{Ti}) \, \mathcal{P}_{MET}(y; H_T) \prod_{i=1}^{4} p_J(x_i, n_i; p_{Ti})        (18)

Ultimately it is an experimental question whether such a factorization holds. At some level
we certainly expect correlations between the four jets and deviations from the form of Eq. 18.
For example, we would expect correlations to arise from color (re)connections as well as
out-of-jet radiation. The presence of significant pile-up (so long as it remains unsubtracted)
would also tend to result in (positive) correlations between the jets.

In the case that the correlations are large it may be necessary to systematically include
corrections to Eq. 18. We anticipate that some kind of principal component analysis or form
of tensor decomposition would be applicable. We leave this interesting question to future
work. For the remainder of this section we would like to explore the degree to which the
universality assumptions underlying Eq. 18 are valid in the only data sample available to
us, namely the 4.8 million SHERPA events described in Sec. 3.2.2.

In a realistic experimental study one would presumably want to measure dσ_4j(pTi) and
p_J(x, n; pT) from independent samples. Given our somewhat limited statistics, however,
we will instead 'measure' dσ_4j(pTi) and p_J(x, n; pT) from the same 4-jet sample and use
Eq. 18 to construct an estimate of the full 9-dimensional distribution. This will allow us
to estimate acceptances after imposing E_T^miss, Mj and Nj cuts. The degree to which this
procedure reproduces cut acceptances in the raw event sample will reflect the viability of
the jet template and E_T^miss template ansatze.

^ See e.g. Figure 2 in ref. [13].

FIG. 7: Testing the jet template ansatz. The figure on the left compares the raw Mj cut
acceptance (red) to the template estimate (blue). The figure on the right is analogous,
with a sliding N_CA cut and a fixed cut Mj > 280 GeV. The red uncertainties are statistical
(the statistical uncertainties for the template estimate are not shown, since by construction
they are parametrically smaller than the raw uncertainties).

Given our somewhat limited statistics, it is difficult to judge whether deviations between
the raw and template cut acceptances (see Fig. 7) are an indication of deviations from Eq. 18
or just statistical fluctuations. Nevertheless, the approximate agreement over 7 orders of
magnitude of cut acceptance in Fig. 7 is a promising result, although more sophisticated
statistical methods would likely be required for a robust experimental analysis.

When the jet template and E_T^miss template ansatze are appropriate, they have the advantage
of reducing the statistical uncertainties in dσ_4j(E_T^miss, m_i, n_i). This follows directly from the
reduced dimensionality of the problem. This reduction is especially significant in the tails
of the distribution, where statistical uncertainties are parametrically reduced by virtue of
the fact that, due to the convolution, they receive sizeable contributions from statistically
rich parts of the component probability distributions. For example, a rare 4-jet event with
Nj = 12 will be dominated by contributions from four jets with three subjets each. The
probability of a single fat jet having three subjets at a given m/pT and pT can be measured
(or calculated) readily.
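
The way in which the convolution populates the tail can be illustrated with a deliberately
simplified Python sketch. It assumes a single-jet subjet-count distribution that is the same
for all four jets and ignores the dependence on pT and m/pT that the full template p_J(x, n; pT)
carries; the function name summed_count_pdf and the toy numbers in the usage note are
hypothetical.

```python
import numpy as np

def summed_count_pdf(single_jet_pdf, n_jets=4):
    """Distribution of N_J = n_1 + ... + n_{n_jets} for independent, identical jets.

    single_jet_pdf[k] is the probability that one fat jet has k subjets.
    The returned array gives P(N_J = k) at index k.
    """
    p = np.asarray(single_jet_pdf, dtype=float)
    total = np.array([1.0])  # distribution of an empty sum
    for _ in range(n_jets):
        total = np.convolve(total, p)  # add one more independent jet
    return total
```

With a toy single-jet distribution [0.0, 0.7, 0.2, 0.1] for n = 0, 1, 2, 3, the probability of
Nj = 12 is 0.1^4 = 10^-4. It is fixed entirely by the single-jet probability of having three
subjets, a quantity that is well measured even when Nj = 12 events are rare in the raw sample.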

Note that in the above we have discussed template methods in the context of LHC
data. Another possibility is to use template methods to extend Monte Carlo calculations;
indeed this is precisely what we did in the previous section for the specific case of missing
energy. In the context of Monte Carlo, template methods have the obvious advantage of
reducing statistical uncertainties in the tails of distributions. They also offer the possibility
of extending lower multiplicity calculations to a higher multiplicity regime in the following
sense. Take for example SHERPA, which can generate up to n_j = 6 jets in the final state
at the matrix element level. Thus a SHERPA dijet sample will include jets with up to five
partons generated through matrix elements. When a fat jet template obtained from such a
dijet sample is combined with a 4-jet distribution dσ_4j(pTi), the resulting distributions will
extend the nominal reach of SHERPA beyond n_j = 6. Of course, the theoretical validity of
such a method is a delicate matter. Given the importance of having reliable Monte Carlo
estimates in the tails of distributions, however, such an approach deserves further study. For
example, it would be important to investigate correlations between fat jets. Since SHERPA
extends to n_j = 6, one would be able to, among other possibilities, study 3-jet events and
observe what happens as fat jets come closer together.



5. RESULTS 

This section investigates the benefit of incorporating a subjet counting observable, namely
Nj, into high multiplicity searches based on the summed jet mass observable, Mj. Sec. 5.1
discusses the models used to quantify the improvement in searches that results from
incorporating Nj. These are supersymmetric models whose phenomenology involves the pair
production of gluinos that subsequently decay into the lightest supersymmetric particle.
Both R-parity conserving and R-parity violating models are considered. We choose these
benchmark signals because they are well known in the literature and are easy to implement
in Monte Carlo event generators. Sec. 5.2 describes the signal and background distributions
in signal-like regions. Sec. 5.3 describes the criterion that was used to create optimized
search regions for the benchmark signals. Sec. 5.4 describes the expected sensitivity of the
optimized search regions to the benchmark signals. Finally, Sec. 5.5 compares the optimized
search regions to previous searches.



5.1. Benchmark signals 

The goal of this work is to gain access to a large class of signals without specifically 
targeting any one signal. Nevertheless, it is useful to have some benchmark models to 
consider. While these benchmark models are plausible extensions of the Standard Model, 
more than anything else they are meant to exhibit features of theories that produce high 
multiplicity final states. For any single theory, there are numerous handles beyond large 
multiplicity that could allow for additional discrimination between signal and background. 
For instance, many of our benchmark signals have leptons and b-jets. These are powerful 
handles that can be used in conjunction with the methods in this article, but they are not 
generic to all signals that produce high multiplicity final states. Therefore, these additional 
handles will not be used. 

The eight benchmark models considered in this article arise from the pair production
of gluinos. These benchmarks provide relatively straightforward ways of toggling the
multiplicity of final state partons within a class of models that is easily implemented in all
the standard Monte Carlo calculation packages. Each benchmark model is generated using
MadGraph 4.5.1 [251427] and with up to two additional jets:

    p p \to \tilde{g} \tilde{g} + n_j \, j   with   n_j \le 2        (19)

The MLM matching scheme with a shower-k_⊥ scheme was used to account for the extra
radiation. The events were showered and hadronized using Pythia 6.4 [23]. K factors for
the signals were calculated using Prospino 2.1 [39].

TABLE 3: The eight benchmark signals used in this paper. The numbers in parentheses
indicate the number of final state partons added by choosing that particular branch of the
decay topology.

A collection of signals with diverse phenomenology is considered in order to better explore
and delineate the efficacy of the subjet techniques used in this paper. This diversity arises
through the variety of gluino decay topologies that are possible. The gluino can decay to
light or heavy quarks; it can decay directly to the LSP or instead through a cascade. The
theory can be R-parity conserving or R-parity violating. Decay topologies with cascade
decays or decays involving top quarks as well as certain RPV topologies will lead to very
high multiplicity events with 12 or more final state partons. Indeed one benchmark signal
we consider {Qi) has a spectacular 26 final state partons. The expected sensitivities to all
the signals presented here will be given in Sec. 5.4.



In more detail, all the processes outlined here start with a gluino decaying to either a
light quark pair or a top quark pair and a neutralino:

    \tilde{g} \to q \bar{q} \chi   or   \tilde{g} \to t \bar{t} \chi        (20)

The neutralino χ may be the LSP χ_0 or one of the heavier electroweakinos. For simplicity,
only decays to the LSP and to the NNLSP χ_2 are considered, where the latter decay chain
results in a 2-step cascade:

    \chi_2 \to V \chi_1 \to V V' \chi_0        (21)

Finally the LSP may or may not decay into jets. The constraints on R-parity violation
are much weaker for decays into heavy flavor. If the LSP is lighter than 200 GeV, then
the decays will be dominantly with the λ_ijk U^c_i D^c_j D^c_k flavor structure (ijk) = (2,3,2). The
resulting decay topology is

    \chi_0 \to q q q = c b s.        (22)

If the mass of the LSP χ_0 is above 200 GeV, then it is possible for the dominant R-parity
violating decay mode to be tbs, resulting in four more final state partons over the cbs decay
mode. To keep the number of benchmark signals to a manageable number, we do not include
this decay mode in any of our benchmarks. All of these possibilities taken together result in
eight different gluino decay topologies that span a range of final state parton multiplicities,
see Table 3.



^ Here we use the term parton in a loose sense that includes the leptons from gauge boson decays.

For all signals, we choose the LSP mass using the formula

^xo = (23) 

For the signal models involving heavier electroweakinos we choose the intermediate masses
as follows:

    m_{\chi_2} = (m_{\tilde{g}} + m_{\chi_0})/2        m_{\chi_1} = (m_{\chi_2} + m_{\chi_0})/2        (24)

Note that if the gluinos decay to light quarks and the LSP, the final state will have
between 4 (R-parity conserving) and 10 (RPV) partons, which will make these topologies hard to
discriminate against the ttbar background, especially in the former case. In the case of cascade
decays or decays involving top quarks, however, there will be at least 12 partons in the 
final state and the method outlined in this article should prove more effective. Moreover, 
if R-parity is violated, each LSP will decay to three quarks, thus adding 6 jets to the final 
state. Cuts on the total number of subjets could then provide a competitive replacement 
for MET cuts for these kinds of signals. 

These signals are simply meant as benchmark models to test the sensitivity of our search 
to high multiplicity final states. The search presented here should prove effective for any 
signal implying the existence of final states with 8 or more final state jets. 



5.2. Distributions for signals and backgrounds 

One of the goals of this paper is to investigate the degree to which E_T^miss cuts for the
signals of interest can be substituted (more realistically, loosened) by requiring particular
jet substructure. That this is challenging can be inferred from Fig. 8, which shows the E_T^miss
distributions of the various backgrounds with the E_T^miss distributions of two benchmark signals
superimposed. Although the dominant QCD background is significantly reduced by an E_T^miss
cut of order 150 GeV, any loosening of this cut dramatically increases the number of QCD
events in the search region.

The remaining backgrounds, most of which have intrinsic E_T^miss, require additional cuts to
be suppressed. As shown in Fig. 8, a cut on the sum of the jet masses of order 300 GeV
is effective. That a cut on Mj does not exhaust the discrimination available from jet
substructure can be seen in Fig. 9, which illustrates how even at large values of Mj the N_CA
and N_kT distributions of a typical high multiplicity signal are well separated from that of
the QCD background. The observation of similar behavior for the N-subjettiness ratio 
PU] suggests that this separation should hold for real QCD data as well. 

Thus Nj cuts should be complementary to Mj cuts, allowing for the possibility that E_T^miss
cuts could be loosened. This complementarity is made more explicit in the bottom row of
Fig. 8, which shows the distributions of N_CA and N_kT for the various backgrounds and two
benchmark signals after imposing E_T^miss and Mj cuts. Once these cuts have been imposed,
the dominant remaining background comes from ttbar+jets (and to a lesser extent ^+jets), as
it has the largest number of final state partons. In order for ttbar+jets to pass the E_T^miss cut,
the W bosons have to decay semi-leptonically; while to pass the Mj cut, the final state
partons must be distributed in phase space such that they form massive jets upon fat jet
clustering. The combination of all these cuts strongly suppresses the various backgrounds.

FIG. 8: Top Left: E_T^miss distributions for signals and backgrounds after requiring four or
more fat jets. Top Right: Mj distributions, after requiring four or more fat jets and
E_T^miss > 150 GeV. Bottom Left: N_CA distribution after requiring four or more fat jets,
Mj > 280 GeV and E_T^miss > 150 GeV. Bottom Right: N_kT distribution after requiring four or
more fat jets, Mj > 280 GeV and E_T^miss > 150 GeV. Stacked histograms show the SM
backgrounds, which include top and single top (light brown), V + nj (light blue), diboson
(light yellow), QCD (light green), and the remaining non-QCD backgrounds mentioned in
Sec. 3.1 (light red). The distributions for a 600 GeV gluino in the Qi and Q3 topologies are
shown in purple and black, respectively. Note that the N_CA and N_kT distributions for Q3
(with 20 final state partons) are not substantially different from Qi (with 12).



5.3. Optimizing search strategies 



The simplified models introduced in Sec. 5.1 can be used to develop broad search strategies
that cover the model space. This section describes the method that was used to construct
the minimal number of signal regions necessary to cover the entire space of simplified models.
The method used was introduced in ref. [41], developed further in ref. [42], and is
based on the variable "efficacy," which is defined below.

In order to demonstrate the usefulness of Nj cuts, we present two separately optimized
search strategies. The first uses only Mj and E_T^miss cuts, while the second uses Nj, Mj and
E_T^miss cuts. Since the first set of searches is a subset of the second, the second will always do
at least as well. The degree to which the more complex search strategy can be judged superior (if
at all) will depend on the resulting sensitivities, the number of search regions required, and
the sorts of cuts favored by the introduction of Nj.

The two search strategies, respectively without and with a subjet cut, are defined as

    \mathcal{C} = \{(E_{T,min}^{miss}, M_{J,min})\}   and   \mathcal{C} = \{(N_{subjet,min}, E_{T,min}^{miss}, M_{J,min})\}        (25)

where the values of each of the cut requirements are taken from the following sets:

    N_{J,min} \in \{0, \ldots, 16\}
    E_{T,min}^{miss} \in \{0, 25, \ldots, 600\} GeV
    M_{J,min} \in \{0, 25, \ldots, 1600\} GeV.

This results in 1625 search regions for the first strategy and 27,625 search regions for the
second. The optimized search strategies will make use of only a small subset of these search
regions.

FIG. 9: N_CA (left) and N_kT (right) versus Mj for signal topology with m_g̃ = 600 GeV
(red band) and QCD background (blue band). Each band illustrates ±1σ standard
deviation about the mean, which is denoted by a black line. The superimposed Mj
distributions are for the signal (red) and the QCD background (blue). As defined in Sec. 2,
N_CA and N_kT are the sum of the number of subjets of the four leading jets of each event.
The events considered are required to have at least four jets with pT > 50 GeV and the pT
of their leading jet must be greater than 100 GeV.
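
The quoted numbers of search regions follow directly from the sizes of the three sets of cut
values. The short Python check below enumerates the grids; the variable names are hypothetical
and only the grid definitions are taken from the text.

```python
from itertools import product

met_min = range(0, 601, 25)    # missing energy thresholds in GeV: 25 values
mj_min = range(0, 1601, 25)    # summed jet mass thresholds in GeV: 65 values
nj_min = range(0, 17)          # subjet count thresholds: 17 values

regions_without_nj = list(product(met_min, mj_min))
regions_with_nj = list(product(nj_min, met_min, mj_min))

print(len(regions_without_nj))  # 25 * 65 = 1625
print(len(regions_with_nj))     # 17 * 1625 = 27625
```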

A given signal region or set of cuts, C_i, will yield an expected limit on the cross section
times branching ratio, σ x B, for a given simplified model at the 95% CL, given by

    (\sigma \times B)_i = \Delta(B)_i / (\mathcal{L} \, \epsilon(M)_i)        (26)

Here ε(M)_i is the efficiency of C_i for the model M and L is the integrated luminosity, while
Δ(B)_i is the maximum number of allowed signal events at the 95% CL if B background
events are expected after the cuts and in fact fit the data. We take

    \Delta(B) = 2 \times \sqrt{ Stat(B)^2 + (\epsilon_{syst} B)^2 },        (27)

where Stat(B) is the Poisson limit on B and ε_syst is the systematic uncertainty. Throughout
we will take ε_syst = 30%. The Monte Carlo statistical uncertainties in B and ε(M)_i, i.e. δB
and δε(M)_i, are taken into account by making the replacements

    B \to B + \delta B   and   \epsilon(M)_i \to \epsilon(M)_i - \delta\epsilon(M)_i        (28)

which result in conservative limits.
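
The expected limit for a single search region can be sketched in Python as follows. This is an
illustration under assumptions, not the code used for the results presented here. In particular
the text does not spell out the form of Stat(B); the sketch substitutes the Gaussian
approximation Stat(B) = √B as a placeholder, and the function names delta and expected_limit
are hypothetical.

```python
import math

def delta(b, eps_syst=0.30, stat=math.sqrt):
    """Maximum number of allowed signal events for b expected background events.

    `stat` stands in for Stat(B); the default sqrt(B) is an assumed
    Gaussian approximation, not necessarily the choice made in the text.
    """
    return 2.0 * math.sqrt(stat(b) ** 2 + (eps_syst * b) ** 2)

def expected_limit(b, db, eff, deff, lumi, eps_syst=0.30, stat=math.sqrt):
    """Expected limit on sigma x B for one search region.

    The Monte Carlo uncertainties are applied conservatively:
    the background is shifted up by db and the efficiency down by deff.
    """
    b_cons = b + db
    eff_cons = eff - deff
    if eff_cons <= 0.0:
        return math.inf  # no sensitivity in this region
    return delta(b_cons, eps_syst, stat) / (lumi * eff_cons)
```

For example, with B = 100, no Monte Carlo uncertainties, an efficiency of 0.1 and a luminosity
of 30 fb^-1, the sketch gives 2√1000/3 ≈ 21 fb under the assumed form of Stat(B).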

The optimal limit on a model M is then given by

    (\sigma \times B)_{opt} = \min\{ (\sigma \times B)_i : i \in \{1, \ldots, N_{cuts}\} \},        (29)

where the number of search regions is N_cuts = 1625 or 27,625 depending on whether Nj cuts
are being used. It is natural to quantify the "goodness" of a cut C_i by how close it is to
optimal. For this purpose, we introduce the efficacy of a cut

    \mathcal{E}(C_i) = (\sigma \times B)_i / (\sigma \times B)_{opt}        (30)

This is the ratio of the expected limit on the production cross section using a particular
cut C_i divided by the expected limit on the cross section using the optimal set of cuts. An
efficacy of 1.0 is ideal. Thus the best search strategy for covering a collection of model
points {M_j} will be a combination of cuts {C_i} such that E is close to one for each M_j for
at least one of the C_i. This article will use E < E_crit = 1.5 as the criterion for optimizing the
number of search regions. That is, each model point M will be covered by at least one cut
that yields a limit on σ x B that is within a factor of 1.5 of the optimal limit. The efficacy
approach has several advantages:

• it ensures near optimal coverage over the range of signals; 

• it allows for a fair comparison between different sets of observables; 

• it allows for a reasonable comparison to the ATLAS high multiplicity search, which 
makes use of 6 search regions; and 

• each signal is grouped with like signals on the basis of which search region it is covered 
by. 
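
To make the efficacy criterion concrete, the following Python sketch computes efficacies from a
table of expected limits and checks which models a given set of cuts covers. The data layout
(a dictionary of limits per model and per cut), the function names, and the toy numbers are
assumptions made for illustration only.

```python
def efficacies(limits):
    """Efficacy of each cut for one model: its limit divided by the best limit."""
    best = min(limits.values())
    return {cut: limit / best for cut, limit in limits.items()}

def covered_models(limits_by_model, strategy, e_crit=1.5):
    """Models for which at least one cut in `strategy` has efficacy below e_crit."""
    covered = set()
    for model, limits in limits_by_model.items():
        eff = efficacies(limits)
        if any(eff[cut] < e_crit for cut in strategy):
            covered.add(model)
    return covered

# Toy expected limits (arbitrary units) for two models and three cuts.
toy_limits = {
    "M1": {"C1": 1.0, "C2": 1.4, "C3": 3.0},
    "M2": {"C1": 5.0, "C2": 2.9, "C3": 2.0},
}

print(covered_models(toy_limits, {"C1"}))  # only M1 is covered
print(covered_models(toy_limits, {"C2"}))  # both models are covered
```

In this toy case the single cut C2 is optimal for neither model but lies within a factor of 1.5
of the best limit for both, which is the kind of compromise the criterion is meant to reward.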

Finding a search strategy that covers all models with a desired efficacy is computationally
challenging because the configuration space is enormous, with $2^{N_{cuts}}$ possible search
strategies. Since a brute force search is not feasible, we use a genetic algorithm to construct
the minimal set of search regions needed to cover the entire space of models. This algorithm,
which we find to be quite effective for the task at hand, is described in App. A and is based
on a genetic algorithm described in detail in [12].



5.4. Expected sensitivity 

Expected sensitivities to the various benchmark signals at $\sqrt{s} = 8$ TeV are depicted in
Figs. 10 to 12. These are presented as expected 95% exclusion limits on $\sigma \times B$ (the
production cross section times the branching ratio into that particular gluino decay topology)
as a function of the gluino mass and for an integrated luminosity of 30 fb$^{-1}$. As expected,
the performance of the $M_J + \not\!\!E_T + N_j$ search depends strongly on the final state
multiplicity as well as the intrinsic $\not\!\!E_T$ of the signal.

The results of the subjet counting search are best revealed by comparing the optimal
search regions for the $M_J + \not\!\!E_T$ search to those of the $M_J + \not\!\!E_T + N_j$
search. For the case of $N_{CA}$ the former has 6 search regions, while the latter has 5 search
regions; see Tables 4 and 5. Interestingly, the efficacy criterion groups the signals into
roughly similar signal classes, with the difference that some of the $M_J$ and $\not\!\!E_T$
cuts move around once $N_{CA}$ cuts are introduced.

The first class of signals consists of $\mathcal{G}_0$ alone, which has intrinsic
$\not\!\!E_T$ from the stable (non-RPV) LSP and only 4 final state partons. Consequently the
cuts (and expected limits, see Fig. 10) do not change substantially after the introduction of
a (trivial) $N_{CA}$ cut.
Columns 1 to 3: Search Region. Columns 4 to 5: Models Covered. Columns 6 to 10: Background (for 30 fb$^{-1}$).

# | $M_J$ | $\not\!\!E_T$ | Class                 | $m_{\tilde g}$ (TeV) | QCD          | $t\bar t$    | V+jets       | Other       | Total
1 | 1000  | -             | $\mathcal{G}_4$       | < 1.0                | 495 ± 61.5   | 2.38 ± 0.69  | 6.93 ± 2.73  | 0.13 ± 0.10 | 505 ± 62
2 | 1350  | -             | $\mathcal{G}_4$       | > 1.0                | 13.7 ± 1.5   | < 0.1        | 0.54 ± 0.54  | < 0.1       | 14.3 ± 1.6
3 | 400   | 400           | $\mathcal{G}_0$       | < 1.2                | 0.38 ± 0.04  | 16.63 ± 1.81 | 14.30 ± 2.62 | 4.40 ± 1.52 | 35.71 ± 3.53
  |       |               | $\mathcal{G}_1$       | 0.8 to 1.1           |              |              |              |             |
4 | 500   | 200           | $\mathcal{G}_1$       | < 0.8                | 23.9 ± 4.9   | 54.6 ± 3.3   | 28.0 ± 5.6   | 6.26 ± 1.52 | 112.8 ± 8.2
  |       |               | $\mathcal{G}_{2,3}$   | < 0.9                |              |              |              |             |
5 | 625   | 425           | $\mathcal{G}_0$       | > 1.2                | 0.09 ± 0.02  | 0.59 ± 0.34  | 0.73 ± 0.73  | 0.47 ± 0.29 | 1.89 ± 0.86
  |       |               | $\mathcal{G}_1$       | > 1.1                |              |              |              |             |
  |       |               | $\mathcal{G}_{2,3}$   | > 1.3                |              |              |              |             |
6 | 725   | 175           | $\mathcal{G}_{2,3}$   | 0.9 to 1.3           | 5.28 ± 0.72  | 5.34 ± 1.03  | 2.85 ± 1.08  | 0.41 ± 0.18 | 13.87 ± 1.67
  |       |               | $\mathcal{G}_{5,6,7}$ | all                  |              |              |              |             |

A dash marks a cut for which no value is listed.

TABLE 4: Search regions for the $M_J + \not\!\!E_T$ search with cuts in GeV and assuming 30%
systematic uncertainties. For each search region $C_i$ the column 'Models Covered' lists the
benchmark models that are optimally covered by $C_i$. The search regions are chosen using
the efficacy criterion $\mathcal{E} < 1.5$. The background uncertainties shown are statistical.



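Columns 1 to 4: Search Region. Columns 5 to 6: Models Covered. Columns 7 to 11: Background (for 30 fb$^{-1}$).

# | $M_J$ | $\not\!\!E_T$ | $N_{CA}$ | Class                 | $m_{\tilde g}$ (TeV) | QCD          | $t\bar t$    | V+jets       | Other       | Total
1 | 450   | 450           | -        | $\mathcal{G}_0$       | all                  | 0.18 ± 0.26  | 8.31 ± 1.28  | 2.05 ± 1.08  | 0.64 ± 0.26 | 11.18 ± 1.70
2 | 1050  | -             | 13       | $\mathcal{G}_4$       | all                  | 21.60 ± 3.03 | < 0.1        | < 0.1        | 0.03 ± 0.01 | 21.63 ± 3.03
3 | 475   | 275           | 11       | $\mathcal{G}_1$       | all                  | 0.96 ± 0.46  | 4.16 ± 0.91  | 0.78 ± 0.59  | 0.03 ± 0.01 | 5.90 ± 1.18
  |       |               |          | $\mathcal{G}_2$       | > 0.8                |              |              |              |             |
  |       |               |          | $\mathcal{G}_3$       | > 0.9                |              |              |              |             |
4 | 525   | 125           | 12       | $\mathcal{G}_2$       | < 0.8                | 7.86 ± 1.92  | 7.72 ± 1.24  | 6.71 ± 4.58  | 0.33 ± 0.19 | 22.65 ± 5.11
  |       |               |          | $\mathcal{G}_3$       | < 0.9                |              |              |              |             |
  |       |               |          | $\mathcal{G}_{5,6}$   | > 0.9                |              |              |              |             |
5 | 425   | 125           | 14       | $\mathcal{G}_{5,6}$   | < 0.9                | 1.08 ± 0.32  | 1.19 ± 0.49  | < 0.1        | 0.01 ± 0.01 | 2.26 ± 0.58
  |       |               |          | $\mathcal{G}_7$       | all                  |              |              |              |             |

A dash marks a cut for which no value is listed.

TABLE 5: Search regions for the $M_J + \not\!\!E_T + N_{CA}$ search with $M_J$ and
$\not\!\!E_T$ cuts in GeV and assuming 30% systematic uncertainties. For each search region
$C_i$ the column 'Models Covered' lists the benchmark models that are optimally covered by
$C_i$. The search regions are chosen using the efficacy criterion $\mathcal{E} < 1.5$. The
background uncertainties shown are statistical.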



The second class of signals consists of $\mathcal{G}_4$ alone, which differs from
$\mathcal{G}_0$ in that the LSP undergoes the RPV decay $\chi \to cbs$. Consequently there is
no intrinsic $\not\!\!E_T$, and both search strategies cover $\mathcal{G}_4$ with search
regions that have trivial $\not\!\!E_T$ cuts. Since, however, $\mathcal{G}_4$ is a high
multiplicity signal with 10 final state partons, the corresponding $N_{CA}$ search region
imposes a significant cut $N_{CA} > 13$ with a loosened $M_J$ cut. This results in an expected
limit on $\sigma \times B$ that is better by a factor of 2 to 4 compared to the optimized
$M_J + \not\!\!E_T$ search
[Figure 10: plot not reproduced; horizontal axis $m_{\tilde g}$ (GeV), ticks from 600 to 1400.]

FIG. 10: 95% exclusion limits on $\sigma \times B$ for the $M_J + \not\!\!E_T$ search (dashed
blue), the $M_J + \not\!\!E_T + N_{CA}$ search (solid red), and the
$M_J + \not\!\!E_T + N_{k_T}$ search (dash-dotted brown) for signal $\mathcal{G}_0$, which has
only 4 final state partons. The exclusion limits given by the ATLAS high multiplicity search
[43] (dash-dotted green line) and the CMS black hole search [44] (dashed black) as well as the
NLO gluino production cross section (grey solid) are also shown. The systematic uncertainty
on the background is assumed to be 30%. Note that the CMS limit is rescaled by a factor of 5.



(see Fig. 11) but that is nevertheless weaker than what would be needed to exclude the
benchmark gluino cross section.

The third class of signals consists of $\mathcal{G}_5$, $\mathcal{G}_6$ and $\mathcal{G}_7$.
These signals have intrinsic $\not\!\!E_T$ from top quarks or electroweak gauge bosons produced
in the gluino decay chain. They also have especially large final state multiplicities, since
the LSPs at the end of the decay chain undergo the RPV decay $\chi \to cbs$. The inclusion of a
cut $N_{CA} > 12$ to 14 improves the expected limit by a factor of 2 to 5 depending on the
specific signal and gluino mass. The $\not\!\!E_T$ cut is loosened by 50 GeV, while the $M_J$
cut is lowered by 200 to 300 GeV. This represents a modest success in trading our reliance on
$\not\!\!E_T$ cuts for a more refined use of jet substructure observables.

The fourth and final class of signals consists of $\mathcal{G}_1$, $\mathcal{G}_2$ and
$\mathcal{G}_3$. These signals have large intrinsic $\not\!\!E_T$ because the LSPs at the end
of the gluino decay chain are stable and because top quarks and/or electroweak gauge bosons are
produced in the decay chain. The top quarks and electroweak gauge bosons also ensure that the
final state multiplicity is high. The inclusion of a $N_{CA}$ cut improves the expected limit
on $\sigma \times B$ by a factor of 2 to 4 for low and intermediate gluino masses, with little
or no improvement at large gluino masses. The inclusion of a $N_{CA}$ cut also loosens (in most
places) the requirements on $M_J$ and $\not\!\!E_T$ by 50 to 200 GeV and 50 to 150 GeV,
respectively. This demonstrates that for these signals with significant $\not\!\!E_T$ there is
more room for loosening $\not\!\!E_T$ requirements in favor of $N_j$ requirements.
FIG. 11: 95% exclusion limits on $\sigma \times B$ for the $M_J + \not\!\!E_T$ search (dashed
blue), the $M_J + \not\!\!E_T + N_{CA}$ search (solid red), and the
$M_J + \not\!\!E_T + N_{k_T}$ search (dash-dotted brown) for signal $\mathcal{G}_4$, which has
10 final state partons and no intrinsic $\not\!\!E_T$. The exclusion limit given by the CMS
black hole search (dashed black) as well as the NLO gluino production cross section (grey
solid line) are also shown. The systematic uncertainty on the background is assumed to be 30%.
The ATLAS limit is not shown because it is orders of magnitude worse than the others due to
its strict $\not\!\!E_T$ requirement.



5.5. Comparison with previous searches 

This section presents a comparison of the techniques proposed in this article to previous
searches. This is not meant to be a complete survey of all the searches that have been
performed at the LHC and that are sensitive to high multiplicity signals. Two searches are
considered. The first, presented in Sec. 5.5.1, is an ATLAS search that requires up to 9
$R = 0.4$ jets with missing energy. The second, presented in Sec. 5.5.2, is a search for
"black holes" at CMS. These two searches are different attempts at gaining access to high
multiplicity final states. We find that the methods presented in this article are competitive.



5.5.1. ATLAS High Multiplicity Search 

A comparison is made with ATLAS's most up-to-date high multiplicity search, which
makes use of 5.8 fb$^{-1}$ at 8 TeV [43]. This search clusters events into $R = 0.4$
anti-$k_T$ jets and looks at 6 search regions:

• $n_j \geq 7$ with $p_T > 55$ GeV

• $n_j \geq 8$ with $p_T > 55$ GeV

• $n_j \geq 9$ with $p_T > 55$ GeV

• $n_j \geq 6$ with $p_T > 80$ GeV

• $n_j \geq 7$ with $p_T > 80$ GeV

• $n_j \geq 8$ with $p_T > 80$ GeV

It is further required that events contain no isolated leptons and that
$\not\!\!E_T / \sqrt{H_T} > 4$ GeV$^{1/2}$. Note that for an event with $H_T = 1000$ GeV
(3000 GeV) the latter cut corresponds to a $\not\!\!E_T$ cut of 126 GeV (219 GeV).

FIG. 12: 95% exclusion limits on $\sigma \times B$ for the $M_J + \not\!\!E_T$ search (dashed
blue), the $M_J + \not\!\!E_T + N_{CA}$ search (solid red), and the
$M_J + \not\!\!E_T + N_{k_T}$ search (dash-dotted brown) for the R-parity conserving topologies
$\mathcal{G}_1$, $\mathcal{G}_2$, and $\mathcal{G}_3$ (left, top to bottom) and the
corresponding RPV ones, $\mathcal{G}_5$, $\mathcal{G}_6$, and $\mathcal{G}_7$ (right, top to
bottom). The exclusion limits given by the ATLAS high multiplicity search [43] (dash-dotted
green) and the CMS black hole search [44] (dashed black) as well as the NLO gluino production
cross section (grey solid line) are also shown. The systematic uncertainty on the background
is assumed to be 30%. Note that the CMS limit is rescaled by a factor of 5.
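
The correspondence between the ATLAS requirement on the ratio of missing energy to the square
root of $H_T$ and an effective missing energy threshold is simple arithmetic, reproduced in the
short Python sketch below. The function name is illustrative.

```python
import math

def met_threshold(h_t_gev, ratio_cut=4.0):
    """Missing energy threshold (GeV) implied by MET / sqrt(H_T) > ratio_cut."""
    return ratio_cut * math.sqrt(h_t_gev)

for h_t in (1000.0, 3000.0):
    print(f"H_T = {h_t:.0f} GeV -> MET > {met_threshold(h_t):.0f} GeV")
# prints:
# H_T = 1000 GeV -> MET > 126 GeV
# H_T = 3000 GeV -> MET > 219 GeV
```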

In order to compare the performance of the ATLAS search to the optimized search strategies
in Tables 4 and 5 we assumed that the ATLAS search could be scaled up to 30 fb$^{-1}$ while
keeping the cuts fixed. The re-estimated expected limits have been computed by linearly
rescaling the expected number of background events and the corresponding uncertainties (as
given by ATLAS) to the new luminosity and computing the exclusion limit as outlined in
Sec. 5.3. Such a linear rescaling, which assumes that systematic uncertainties do not come
to dominate the limits, is probably overly optimistic.
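
The linear rescaling described above amounts to multiplying both the background yield and its
uncertainty by the luminosity ratio. A minimal Python sketch follows; the function name and the
input numbers are hypothetical and are not ATLAS values.

```python
def rescale_to_luminosity(n_bkg, d_n_bkg, lumi_old=5.8, lumi_new=30.0):
    """Scale a background yield and its uncertainty linearly with luminosity."""
    scale = lumi_new / lumi_old
    return n_bkg * scale, d_n_bkg * scale

# Hypothetical yield of 10 +- 2 events at 5.8 fb^-1.
n_new, dn_new = rescale_to_luminosity(10.0, 2.0)
print(f"{n_new:.1f} +- {dn_new:.1f} events at 30 fb^-1")
```

Because the yield and its uncertainty are scaled by the same factor, the relative uncertainty
is unchanged by construction.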

We find that for those benchmark signals with large final state multiplicities and intrinsic
$\not\!\!E_T$ ($\mathcal{G}_{1,2,3}$ and $\mathcal{G}_{5,6,7}$), i.e. the sorts of signals that
the ATLAS search is designed to be sensitive to, the $M_J + \not\!\!E_T + N_{CA}$ search
generally outperforms the ATLAS search, particularly at higher gluino masses, where the
exclusion limit on $\sigma \times B$ is improved by a factor of 2 to 5. For these benchmark
signals this corresponds to an extended gluino mass reach of order 100 GeV to 250 GeV. While
it is difficult to know the extent to which this promising result can be realized at ATLAS or
CMS, it is worth emphasizing that whatever the final search sensitivity should turn out to be,
it is already valuable to have a search strategy that is governed by different systematic
uncertainties.



5.5.2. CMS Black Hole Search

The CMS black hole search [44], which makes use of 4.7 fb$^{-1}$ at 7 TeV, is also sensitive
to high multiplicity final states. This search makes use of 16 search regions, each of which
corresponds to different $S_{T\,\min}$ and $N_{\min}$ cuts. Here
$S_{T\,\min} \in [1.9, 4.1]$ TeV is a cut on the scalar sum over transverse energy and
$N_{\min} \in [3, 7]$ is a cut on the total number of reconstructed objects with
$E_T > 50$ GeV. See that reference for details.

In order to compare the expected performance of the CMS search to the optimized search
strategies in Tables 4 and 5 it is necessary to extrapolate the CMS background estimates
from 7 TeV to 8 TeV. Because there are no missing energy requirements, the background is
completely dominated by QCD events. The absence of any intrinsic high energy scale in the
background allows us to adopt the following approximate extrapolation. For each value of
$N_{\min}$, CMS provides the expected number of background events as a function of $S_T$,
which we fit to an exponential,

$N^{7\,\mathrm{TeV}}_{\mathrm{QCD}}(S_T; N_{\min}) = e^{-a (S_T - S_T^0)}$,    (31)

where $a$ and $S_T^0$ are fit parameters. The number of background events as a function of
$S_T$ at 8 TeV is then estimated to be

$N^{8\,\mathrm{TeV}}_{\mathrm{QCD}}(S_T; N_{\min}) = N^{7\,\mathrm{TeV}}_{\mathrm{QCD}}(\tfrac{7}{8} \times S_T; N_{\min})$.    (32)

These background estimates can then be rescaled to 30 fb$^{-1}$ and combined with the
efficiencies of the benchmark signals in each of the 16 search regions to obtain the expected
sensitivity of the CMS search at 8 TeV. While these background estimates are inexact, they are
sufficient for demonstrating that the CMS search results in expected limits that are about two
orders of magnitude weaker than those obtained by a $M_J + \not\!\!E_T + N_{CA}$ search (see
Fig. 12). This is because for the gluino masses that are accessible at 8 TeV, the $S_T$ cuts
are highly inefficient, and the absence of missing energy requirements results in large QCD
backgrounds. This demonstrates that high multiplicity searches targeting black holes are not
necessarily well suited for other kinds of high multiplicity signals.
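
The background extrapolation used for the CMS comparison can be illustrated with the Python
sketch below. It fits an exponential in $S_T$ by least squares on the logarithm of the yields
and then evaluates the fitted 7 TeV curve at a rescaled $S_T$. Several ingredients are
assumptions made for illustration: the input points are synthetic rather than CMS numbers, the
log-linear fit is one of several reasonable fitting choices, and the rescaling factor is taken
to be the ratio of collision energies, 7/8.

```python
import math

def fit_exponential(s_t, counts):
    """Fit N(S_T) = exp(-a * (S_T - S_T0)) via least squares on log N."""
    n = len(s_t)
    logs = [math.log(c) for c in counts]
    mean_x = sum(s_t) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in s_t)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(s_t, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    a = -slope                 # log N = -a * S_T + a * S_T0
    s_t0 = intercept / a
    return a, s_t0

def n_qcd_7tev(s_t, a, s_t0):
    """Fitted background yield at 7 TeV."""
    return math.exp(-a * (s_t - s_t0))

def n_qcd_8tev(s_t, a, s_t0, energy_ratio=7.0 / 8.0):
    """ASSUMPTION: the 8 TeV yield is the 7 TeV curve at energy_ratio * S_T."""
    return n_qcd_7tev(energy_ratio * s_t, a, s_t0)

# Synthetic points (S_T in TeV) generated from a = 3.0 / TeV and S_T0 = 3.2 TeV.
s_t_points = [2.0, 2.5, 3.0, 3.5]
yields = [math.exp(-3.0 * (s - 3.2)) for s in s_t_points]

a_fit, s_t0_fit = fit_exponential(s_t_points, yields)
print(f"a = {a_fit:.3f} / TeV, S_T0 = {s_t0_fit:.3f} TeV")
print(f"N_7TeV(3 TeV) = {n_qcd_7tev(3.0, a_fit, s_t0_fit):.3f}")
print(f"N_8TeV(3 TeV) = {n_qcd_8tev(3.0, a_fit, s_t0_fit):.3f}")
```

Since the fitted curve falls with $S_T$, evaluating it at a smaller argument always yields a
larger background estimate at fixed $S_T$.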



6. DISCUSSION 

Recent years have seen an impressive amount of research on a large variety of jet
substructure techniques.^ The majority of this work has focused on the development of either
general purpose tools (jet grooming, top tagging, etc.) or jet substructure analyses tailored
to specific search channels (e.g. the BDRS boosted Higgs search [22]). One area that has
seen less work is the design of search techniques for topologies that are more complicated or 
whose structure is not known a priori. In this paper we have taken a step in this direction 
by arguing that jet substructure suggests a different approach to counting jet multiplicities 
that results in an effective search strategy that is sensitive to a variety of high multiplicity 
topologies. 

The flexibility inherent in this approach raises the possibility of loosening missing energy
cuts in favor of well chosen jet substructure cuts. This is of special interest for new physics
scenarios in which signals exhibit little or no intrinsic missing energy, such as supersymmetric
scenarios with baryonic RPV. In Sec. 5 we have seen that for signals with large final state
multiplicities and some (though not necessarily very much) intrinsic $\not\!\!E_T$, the
introduction of $N_{CA}$ cuts does in fact lead to lower $\not\!\!E_T$ requirements. While this
represents only a modest push towards the regime of (near)-vanishing $\not\!\!E_T$
requirements, it is nevertheless an encouraging result given how effective $\not\!\!E_T$
requirements are in reducing the huge QCD backgrounds. In fact, we find that trading
$\not\!\!E_T$ cuts for $N_{CA}$ cuts is particularly effective for the QCD background; it is
the need to suppress the $t\bar t$+jets background that prevents the $\not\!\!E_T$ cuts from
being loosened further in Table 5. We anticipate that if additional handles were introduced to
combat the $t\bar t$+jets background (e.g. vetoing on b-jets), then $\not\!\!E_T$ cuts could be
loosened even further. We have not pursued this interesting direction here, since our goal was
to keep the search strategy as inclusive as possible.

One possible concern with high multiplicity searches at the LHC is their potential
sensitivity to pile-up, something that becomes more pressing as the LHC pushes towards higher
and higher luminosities. In this paper we have advocated the use of jet trimming to reduce
the present search's sensitivity to pile-up, but something like the technique introduced in
ref. [16] might also be necessary. This whole issue would need to be revisited if the search
were to be performed by ATLAS or CMS. This is particularly the case because it is impossible
for us to thoroughly examine pile-up effects given their sensitivity to detector effects and
the fact that each collaboration has detector specific methods for mitigating pile-up effects.
It is worth pointing out that one possible advantage of $N_{CA}$ is that it includes some
built-in jet grooming by virtue of the veto it imposes on asymmetric subjet energy sharing.

^ For a comprehensive set of references see the BOOST 2010 & 2011 proceedings [45].

In conclusion, we have seen that an effective search strategy can be developed by exploiting
missing energy, a sum over fat jet masses, and a sum over fat jet subjet counts. The
two subjet counting algorithms presented, $N_{CA}$ and $N_{k_T}$, yield comparable results, so
that a choice between the two would need to be guided by experimental studies (with a
particular focus on inherent systematic uncertainties, performance under pile-up, etc.). Other
subjet counting algorithms are possible,^ but what we would like to stress here is that, as has
been seen in many other jet substructure studies, the flexibility of the fat jet approach is
very powerful. In this case the potential for systematic data driven estimates of the QCD
background is of particular importance. As another example of the flexibility of the fat
jet approach, we refer the reader to the related work in ref. [17], which focuses on high
multiplicity hadronic final states with vanishing missing energy.
Appendix A: Genetic algorithm for optimizing search regions

The genetic algorithm is initialized with 1000 search strategies, where each search strategy
is a set of search regions. Each of these search strategies is formed as follows. First a random
selection of 40 of the $N_{cuts}$ search regions is chosen. Each of these 40 search regions is
assigned a weight proportional to the number of models it covers with
$\mathcal{E} < \mathcal{E}_{crit}$. Finally, 1000 search strategies are created by sampling
(without replacement) from these 40 search regions. This gives a slight preference in the
initialization stage to search regions that are sensitive to more models. The exact
initialization procedure is not critical for rapid convergence of the algorithm.
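
The initialization step can be sketched in Python as follows. This is an illustration rather
than the authors' code: the number of search regions drawn per strategy is not specified in the
text, so a random size between 1 and a hypothetical `max_size` is used, and one is added to each
weight so that a region covering no model can still be drawn.

```python
import random

def weighted_sample_without_replacement(items, weights, k, rng):
    """Draw up to k distinct items, each draw weighted by the remaining weights."""
    items, weights = list(items), list(weights)
    chosen = []
    for _ in range(min(k, len(items))):
        idx = rng.choices(list(range(len(items))), weights=weights, k=1)[0]
        chosen.append(items.pop(idx))
        weights.pop(idx)
    return chosen

def initial_population(all_regions, coverage_count, rng,
                       pool_size=40, n_strategies=1000, max_size=10):
    """Build the starting population from a random pool of search regions."""
    pool = rng.sample(all_regions, pool_size)
    # ASSUMPTION: +1 keeps every weight positive.
    weights = [coverage_count(region) + 1 for region in pool]
    population = []
    for _ in range(n_strategies):
        size = rng.randint(1, max_size)  # hypothetical strategy size
        picks = weighted_sample_without_replacement(pool, weights, size, rng)
        population.append(frozenset(picks))
    return population

# Toy usage: regions are labelled by integers and coverage counts are made up.
rng = random.Random(0)
regions = list(range(1625))
population = initial_population(regions, lambda region: region % 3, rng)
print(len(population), "strategies built from",
      len(set().union(*population)), "distinct search regions")
```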

The search strategies are evaluated to see how many models they cover within the desired 
efficacy, and a "fitness" is assigned to them with the formula 

nC.M)^j,^-^y^^, (Al) 

where $M$ is the number of models covered, $C$ is the number of search regions in the search
strategy, and $M_{\max}$ is the total number of models. This fitness function strongly
penalizes search strategies that do not cover all models, followed by a penalty for having too
many search regions.



^ We have investigated the possibility of a subjet counting algorithm based on N-subjettiness [T2j, using a
boosted decision tree to map $\tau_N$ space to different subjet multiplicities. Although the resulting algorithm
performed worse than $N_{CA}$ and $N_{k_T}$, it is possible that a more thorough study could lead to improvements.
After evaluating the fitness of the search strategies, the least fit 50% are removed. Pairs
of fit search strategies are then selected and a new search strategy is created by taking a
randomly determined fraction of each search strategy's search regions. For instance, if the
two selected search strategies had $N_1$ and $N_2$ search regions, then a uniform random number
on the unit line segment, $x$, would determine that $x N_1$ search regions would be taken from
the first search strategy and $(1 - x) N_2$ would be taken from the second search strategy. So
if $N_1 = 20$ and $N_2 = 30$ and $x = 0.20$, 4 search regions would be taken from the first
search strategy and 24 would be taken from the second. If duplicate search regions are
selected, the duplicate is removed, reducing the number of search regions. After creating a new
search strategy, the search strategy is mutated to guarantee that the population of search
strategies has sufficient diversity. Each search region within a search strategy has a finite
probability of being changed to another random search region. We use 6% for this probability,
known as the "mutation rate". Thus for a search strategy with 16 search regions, 1 change
would be made on average.
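
The crossover and mutation steps can be sketched in Python as below. This is a minimal
illustration under stated assumptions: search strategies are represented as sets of hashable,
sortable region labels, fractional counts are truncated to integers, and the function names are
hypothetical.

```python
import random

def crossover(parent_a, parent_b, rng, x=None):
    """Combine a fraction x of parent_a with a fraction (1 - x) of parent_b."""
    if x is None:
        x = rng.random()  # uniform random number on the unit interval
    take_a = int(x * len(parent_a))
    take_b = int((1.0 - x) * len(parent_b))
    child = set(rng.sample(sorted(parent_a), take_a))
    # The set union drops any search region selected from both parents.
    child |= set(rng.sample(sorted(parent_b), take_b))
    return child

def mutate(strategy, all_regions, rng, rate=0.06):
    """Replace each search region by a random one with probability `rate`."""
    mutated = set()
    for region in strategy:
        if rng.random() < rate:
            region = rng.choice(all_regions)
        mutated.add(region)
    return mutated

# Toy usage with integer region labels.
rng = random.Random(1)
all_regions = list(range(1625))
parent_a = set(range(20))
parent_b = set(range(100, 130))
child = mutate(crossover(parent_a, parent_b, rng), all_regions, rng)
print(len(child), "search regions in the new strategy")
```

Representing a strategy as a set makes the removal of duplicate search regions automatic, at
the cost of losing any ordering among the regions.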

If after ten consecutive generations no progress has been made, i.e. if no solution has been 
found that covers the entire model space, then a solution is manually created by forcing every 
model to be covered by some search region. This can be done by increasing the number of 
search regions in the search strategies until full coverage is achieved. Finally, if every model 
is covered and no further progress is achieved for seven generations, search strategies are 
scoured to see if any search regions can be removed without reducing coverage. Either way, 
the genetic algorithm is restarted. If no progress in reducing the number of search regions 
in a search strategy has been made in twenty generations, the program ends. 
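
The step in which search strategies are scoured for removable search regions can be sketched as
a simple greedy pass. The Python below assumes coverage is stored as a mapping from each search
region to the set of models it covers with acceptable efficacy; that data structure and the
function name are hypothetical.

```python
def prune(strategy, covers):
    """Drop search regions whose removal leaves the covered models unchanged."""
    kept = list(strategy)
    for region in list(kept):
        others = [r for r in kept if r != region]
        covered_now = set().union(*(covers[r] for r in kept))
        covered_without = set().union(*(covers[r] for r in others))
        if covered_without == covered_now:
            kept = others  # this region was redundant
    return kept

# Toy coverage map: C2 is redundant because C1 already covers M2.
toy_covers = {"C1": {"M1", "M2"}, "C2": {"M2"}, "C3": {"M3"}}
print(prune(["C1", "C2", "C3"], toy_covers))  # prints: ['C1', 'C3']
```

A greedy pass of this kind depends on the order in which regions are examined, so it need not
find the smallest possible strategy.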

Typically, the algorithm converges after 20 to 30 generations, and 10 to 30 distinct
optimized search strategies are found each time. While the termination of the program does
not guarantee that the optimal solution has been found, re-running the program multiple
times usually results in the same number of required search regions. The resulting search
strategies typically have similar features even if they differ slightly in detail.